Optimal  Sensor  Threshold  Control  and  the 
Weapon  Operating  Characteristic  for 
Autonomous  Search  and  Attack  Munitions 

THESIS 

Roland  A.  Rosario,  Captain,  USAF 
AFIT/GAE/ENG/07-02 


DEPARTMENT  OF  THE  AIR  FORCE 
AIR  UNIVERSITY 

AIR  FORCE  INSTITUTE  OF  TECHNOLOGY 

Wright-Patterson  Air  Force  Base,  Ohio 


APPROVED  FOR  PUBLIC  RELEASE;  DISTRIBUTION  UNLIMITED. 


The  views  expressed  in  this  thesis  are  those  of  the  author  and  do  not  reflect  the 
official  policy  or  position  of  the  United  States  Air  Force,  Department  of  Defense,  or 
the  United  States  Government. 




Optimal Sensor Threshold Control and the
Weapon Operating Characteristic for
Autonomous Search and Attack Munitions


THESIS 


Presented  to  the  Faculty 

Department  of  Electrical  and  Computer  Engineering 
Graduate  School  of  Engineering  and  Management 
Air  Force  Institute  of  Technology 
Air  University 

Air  Education  and  Training  Command 
In Partial Fulfillment of the Requirements for the
Degree of Master of Science in Aeronautical Engineering


Roland  A.  Rosario,  B.S. 
Captain,  USAF 


March  2007 


APPROVED  FOR  PUBLIC  RELEASE;  DISTRIBUTION  UNLIMITED. 




Optimal Sensor Threshold Control and the
Weapon Operating Characteristic for
Autonomous Search and Attack Munitions


Roland A. Rosario, B.S.
Captain, USAF


Approved: 


/signed/  7  Mar  2007 

Dr.  Meir  Pachter  (Chairman)  date 

/signed/  7  Mar  2007 

Dr.  David  Jacques  (Member)  date 

/signed/  7  Mar  2007 


Maj Paul Blue (Member) date




Abstract 

This thesis considers the optimal employment of a wide area search munition
in a battlespace where a target is known to be uniformly distributed among false
targets which are Poisson distributed. The Poisson distribution’s parameter is obtained
from readily available battlespace intelligence. This work formulates and solves the
optimal control problem for deriving the optimal sensor threshold schedule in order to
maximize the probability of attacking the target during the battlespace sweep while
constraining the probability of attacking a false target. The efficiency gained by
optimally varying the sensor threshold is compared against the performance achieved
with a static, optimum sensor threshold setting. The Weapon Operating Characteristic,
the relationship between maximum achievable probability of target attack and
maximum allowable probability of false target attack, is developed.

Optimal Sensor Threshold Control and the
Weapon Operating Characteristic for
Autonomous Search and Attack Munitions

I.  Introduction 

1.1 Overview

Ever increasing technological advancements have substantially contributed to
autonomous technology. In particular, the aerospace industry has seen increased
research and development efforts towards autonomous unmanned aerial vehicles (UAVs).
Currently, UAVs perform a wide range of wartime (and peacetime) activities including
reconnaissance and, in some cases, attack. The spectrum of UAVs ranges from high-value
assets, akin to modern, multi-role aerial platforms, to inexpensive, disposable
platforms designed to execute a single mission or task. The new capability afforded by
these autonomous assets fills an important role in new emerging paradigms
characteristic of the Western style of war. One persistent, almost dogmatic, theme has been
to “do more with less”. This concept is supported by the emergence of better
autonomous technology because, in many cases, UAVs and other forms of autonomous
technology are able to automate and perform tasks that otherwise require intensive
commitment of human and other resources. Furthermore, autonomous machines are
not as limited as humans in the bandwidth of cooperation. Because of the benefits
to be gained by cooperative synergism, cooperative control of autonomous agents,
enabled by improvements in modern, autonomous systems, is in parallel development
with autonomous machines.

This research addresses optimal control algorithms for UAVs autonomously
performing search and destroy missions. Cooperative control could be further applied
to optimize the performance of a swarm of autonomous munitions; however, it is
desirable to have each individual agent acting autonomously before implementing
cooperative capabilities. The optimal control aspect is the focus of this thesis, namely,
the performance optimization of individual autonomous agents. This research follows
previous work done (mostly at the Air Force Institute of Technology, AFIT) in similar
areas. Specifically, this thesis investigates the mission efficiency to be gained from
optimal control of dynamically varying parameters such as the agent’s sensor
threshold. This chapter is followed by a detailed mathematical buildup and discussion
of the previous work that has been accomplished in this area and then by the
methodology and results of this research. The rest of this chapter addresses the
scope, motivation, historical background, objectives and a concise summary of this
research.

1.2 Scope

Cooperative  control  is  a  relatively  new  discipline  that  covers  a  wide  range  of 
topics  dealing  with  establishing  a  scheme  of  cooperation  among  autonomous  agents.  In 
other  words,  cooperative  control  efforts  attempt  to  network  and  integrate  machines  so 
they  can  work  together  to  achieve  greater  utility  as  defined  by  their  objectives,  much  in 
the  same  way  as  humans  inherently  act  in  a  group  sharing  the  same  goal.  Cooperative 
control  includes  such  topics  as  formation  flight,  path  planning  and  automated  aerial 
refueling.  The  objective  of  these  examples  is  to  increase  the  efficiency  of  the  mission 
by  synergizing  the  efforts  of  the  involved  agents.  For  instance,  consider  cooperative 
path  planning  of  autonomous  UAVs.  The  optimal  path  for  a  single  vehicle  given  a 
set  of  objectives  is  readily  derived.  Cooperative  path  planning  seeks  to  reconfigure 
that  trajectory  to  incorporate  awareness  of  other  vehicles.  A  cooperative  path  plan 
will  incorporate  multiple  assets  into  the  overall  mission  by  positioning  each  vehicle 
to  maximize  the  overall  objective  success,  not  necessarily  with  respect  to  one  vehicle 
or  another.  In  most  cases,  the  overall  mission  efficiency  of  a  cooperative  mission  is 
greater  than  can  be  achieved  by  a  single  asset. 

The subset of cooperative control that this research addresses is cooperative
search, classification, and attack. As the name would suggest, the aim is to configure
agents, each with the individual capability to autonomously search for targets,
classify them as true or false targets, and decide to attack them with awareness of other
munitions in the same area trying to achieve the same goal. This thesis is further
scoped to address the optimization of an individual munition’s performance. Previous
work has already shown that substantial mission efficiency may be gained by
cooperatively controlling a swarm of such autonomous munitions in an area as opposed to
releasing individual munitions in an area each with the individual search,
classification and attack objective, but lacking awareness of the other collocated agents. In the
future, the previous work on cooperative decision making should be combined with
the results of this thesis, namely, the optimal sensor threshold control of autonomous
munitions, to achieve increased performance from an autonomous swarm. Obviously
this scenario is futuristic: one in which policy makers and the general public trust
and rely upon autonomous machines to safely and effectively perform lethal, wartime
missions. However, garnering support and engendering confidence in this budding
theory is one of the advantages of this research.

1.3  Motivation 

There are ample potential benefits of this research. First, this thesis supports
the paradigm shift introduced above, namely that modern approaches to conducting warfare
increasingly seek methods of doing more with less. The most valued resource in
military operations is the human resource. When able, it is desirable to decrease the risk
to human beings as much as possible. To this end, it is desirable to use autonomous
agents for as many tasks as possible, the prospect of which is becoming more and
more feasible with advances in technology. At the same time that use of autonomous
machines mitigates the risk to humans in hazardous environments, cooperative control
of said machines is useful for increasing the overall mission efficiency. For the same
reason that many human-performed, combat air operations are carried out in flights
of aircraft instead of individual aircraft, cooperative control of machines carrying out the
same tasks may result in increased mission efficiency. Several examples of this shown
in previous research are presented in chapter II of this thesis.

Another  motivating  factor  of  this  research  is  the  fact  that  it  contributes  to 
the  cutting  edge  of  advances  in  technology.  It  belongs  to  the  set  of  research  that 
is  developing  the  mathematical  and  technical  infrastructure  for  future  realization  of 
greater  capability.  Current  trends  in  technological  advancement  and  deployment  of 
autonomous  machines  (particularly  in  the  military  aerospace  sector)  clearly  indicate 
a  future  of  greater  dependence  on  autonomous  agents.  As  recently  as  the  last  decade 
the  U.S.  Air  Force  has  progressed  from  deploying  UAVs  with  a  great  deal  of  human 
intervention  and  control,  to  greater  autonomy  of  UAVs  and  even  arming  UAVs  such 
as  the  Predator  with  lethal  weapons.  An  increasing  number  of  munitions  in  Air  Force 
inventories  around  the  world  are  capable  of  autonomously  performing  tasks  previously 
impossible  without  direct  human  intervention.  Much  like  Billy  Mitchell’s  visionary 
insight  at  the  dawn  of  airpower  in  the  United  States,  there  is  clear  indication  that  in 
the  near  future,  military  powers  will  rely  on  unsupervised,  autonomous  platforms  and 
munitions  to  carry  out  tasks,  such  as  the  search  and  destroy  mission.  This  research 
is  in  direct  support  of  this  emergent  capability. 

In addition to the futuristic benefits of this research there are also immediate
benefits to be gained from this thesis effort. A currently actionable outcome of this
research is a set of analytical tools that may be used to assess the effectiveness of current
operations in realistic, real-world search and destroy missions. The concepts
developed in this work directly apply to current search and destroy operations, whether
human or robotic. Specifically, the analytical tool developed by this research affords
policy makers and war fighters a probabilistic assessment of desired target kill with
consideration of the presence of false targets (either intentional decoys or otherwise
misidentified targets). The theory and application to current assessment of concepts
of operations (CONOPS) and rules of engagement (ROE) will be further developed
and presented in section V of this thesis.

1.4 Background

In 1998, David Jacques and Robert Leblanc first formalized the stochastic
theory enabling a more realistic assessment tool for the autonomous wide area search
munition (WASM) in their paper, “Effectiveness Analysis for Wide Area Search
Munitions” [5]. Traditionally, the effectiveness of a given munition was judged by the
absolute probability of kill metric, Pk. The probability of kill was a subjective
assessment that was bestowed upon a given munition. The main disadvantage of this
metric (and motivation for Jacques and Leblanc’s work) was that the Pk for a given
munition did not consider the stochastic variation encountered by a munition in the
real world. This discrepancy has become more notable and worthy of consideration
with the increasing autonomy of munitions. In the authors’ own words, “The single
shot Pk numbers associated with most direct attack munitions are not directly
applicable to wide area search munitions because they do not account for the difficulty of
searching over tens of square kilometers in order to find a target of interest” [5]. The
new theory incorporated the possibility of falsely classifying and attacking a target,
or not detecting an intended target’s presence at all. This probabilistic approach is
necessary and useful when dealing with munitions capable of autonomously
identifying and attacking targets, because the possibility exists that the automatic target
recognition (ATR) and attack algorithms in the munitions may commit errors when
subjected to the stochastic variation present in the real world.

Given the level of trust necessary to employ munitions in an autonomous search
and destroy role at some point in the future, the success and hence the decision to use
autonomous munitions will have to be judged by the probabilistic metric introduced
above. It will be impossible to deterministically establish the effectiveness or success
of a given munition. However, with readily available intelligence information about
the munition’s area of operation, probabilistic bounds on the success and failure (false
target attack) of a given autonomous munition may be derived which would enable
war fighters and policy makers to make decisions concerning the use of the munition.
This analytical framework has been one of the main emphases of research in previous
years. In addition to this development, other work has been accomplished (using
this probabilistic framework) to optimize the cooperative behavior of a swarm of
munitions. Works by Gillen, Dunkel, Decker, and Kish [2-4, 7] have all been aimed
at satisfying this objective. Specifically, their work, all accomplished at AFIT, has
discovered mission efficiency gains by the optimization of various decision parameters
such as when to cooperatively versus individually classify and attack based on scenario
parameters. Further works by Jacques, Kish and Pachter [6, 9] have extended the
idea of optimizing the mission efficiency of a swarm of autonomous munitions by
addressing the optimal control of dynamically varying parameters. These parameters
are variables that may be actively controlled or changed by the munition during the
mission. Examples include, but are not limited to, sensor threshold, vehicle velocity,
search pattern, sensor swath width, and ATR parameters. Work on optimal control
of dynamically variable parameters has only begun very recently with the paper by
Kish, Jacques and Pachter [9]. The main focus of my research will be to address some
of the remaining gaps in this area of research.

1.5  Objectives 

The objective of this research is to extend the results of the work on optimal
control of munition sensor threshold that Kish produced in 2005 [7]. His original work
showed that increasing mission efficiency was possible for a swarm of autonomous
munitions by optimizing the sensor threshold. The impact of this research is discussed in
greater detail in chapter II of this thesis. Specifically, the objective is to apply the
results of the optimization to produce a WASM Operating Characteristic (WOC), a
performance metric for an autonomous WASM in a battlespace environment with
false targets. Mission efficiency is gauged by the probability of attacking true,
intended targets. At the same time it is important to avoid attacking false targets.
In the case of a swarm of single-use munitions, a false target attack would result in
a wasted munition, that is, a munition expended for no reason. This consequence
is less severe in the case of multiple use munitions, such as a platform with
multiple warheads; however, the scope of this research is confined to single-use munitions.
A false target could also carry adverse political value, such as a hospital or civil
structure. Attacking this type of false target is also undesirable, so the optimization
of the probability of true target attack must be performed while at the same time
constraining the probability of false target attack to an acceptable level.

As an example of this type of optimization, consider a munition sensitive to a
particular type of target. The munition can vary its sensitivity to the characteristics
that uniquely identify an object as that type of target. If the munition
increases its threshold such that it is less sensitive to the target’s characteristics, it
will be more discriminating of false targets, because it will be more likely to dismiss
false alarms of targets with similar attributes. However, the munition will at the same
time hamper its own ability to detect real targets. Thus the end result will be
a decreased probability of attacking false targets, but also a decreased probability
of attacking true targets. The converse holds if the threshold is lowered
to allow consideration of more targets. In this case, the munition will increase its
probability of identifying and attacking a true target, but it will at the same time
increase the risk of being fooled by a false target. In addition to answering the
optimal threshold balance question, the threshold optimization also affords other valuable
insights. For instance, if a munition is close to the end of its time of flight and it has
not encountered and detected any targets of interest, it is desirable (optimal, in fact)
to lower the sensor threshold to allow consideration of a greater number of targets in
the short time remaining. Otherwise, if the munition keeps its threshold
high, it will keep its probability of true target detection and attack low, which
increases the chances of wasting the munition. This scenario is commonly referred
to as a go-for-broke tactic. Studying the results of the optimization and observing
the implications yields these insights and more. A detailed treatment follows in the
subsequent chapters.
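
The threshold tradeoff described above can be sketched numerically. The following
is a minimal illustration and not the sensor model used in this thesis: it assumes
a hypothetical scalar sensor score that is Gaussian with unit variance, with mean 0
for false targets and mean `separation` for true targets, and it declares a target
whenever the score exceeds the threshold. The function names and the value of
`separation` are assumptions made only for this example.

```python
import math


def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def declaration_probabilities(threshold, separation=2.0):
    """Return (P[declare | true target], P[declare | false target]).

    Assumed model (hypothetical): unit-variance Gaussian scores with mean
    `separation` for true targets and mean 0 for false targets.
    """
    p_true = normal_sf(threshold - separation)
    p_false = normal_sf(threshold)
    return p_true, p_false


if __name__ == "__main__":
    # Sweep the threshold to show both probabilities moving together.
    for tau in (0.0, 1.0, 2.0, 3.0):
        p_true, p_false = declaration_probabilities(tau)
        print(f"threshold={tau:.1f}  P(declare|target)={p_true:.3f}  "
              f"P(declare|false)={p_false:.3f}")
```

Under these assumptions both probabilities fall as the threshold rises, which is
the balance the threshold optimization must strike.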

1.6 Approach and Methodology

The approach to achieve the optimal control schedule for the dynamically
variable sensor threshold of an autonomous munition in a search and destroy mission will
use mathematical optimization techniques. Discrete optimization methods are used
to corroborate the results of the continuous-time formulation. The theoretical
framework established in the literature, which is based on the Poisson probability law, is
well suited for closed form functional optimization and optimal control techniques.
Special attention is paid to the closed form, continuous time methodology because it
affords a great deal of insight into the performance and operating characteristic of an
autonomous munition operating in the scenario in question. Gaining this insight is
the objective of this thesis.

In reality, of course, any form of optimization may be used to achieve similar
results. In previous work other methods such as the Response Surface Methodology
have been successfully used to perform optimization [3,4]; however, that
optimization dealt with optimal decision rules, not optimal control. Standard optimal control
techniques including Pontryagin’s maximum principle and Lagrange multiplier
techniques will be used for this problem, since they yield closed form optimal solutions
that are readily achievable given the functional form of the autonomous search and
destroy theoretical framework established in such works as [6]. In addition, this elegant
optimization technique is immune to losses due to numerical imprecision and resistant
to the opacity of meaning in the results that emerge from blindly exercising existing,
commercial, computational optimal control algorithms.

1.6.1 Approach and Methodology: Assumptions. The various scenarios
that describe a single munition or multiple autonomous munitions performing an
autonomous search and destroy mission are established in [6] and elaborated in
chapter II of this thesis. There are various scenarios, but for simplicity and to facilitate
focus on the core problem of dynamically varying parameter optimization, only the
first scenario will be analyzed. This scenario is described by a single target uniformly
distributed among a Poisson field of false targets.
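
This scenario lends itself to a simple numerical check. The sketch below is an
illustration under stated assumptions rather than the formulation developed in
this thesis: the munition is single-use and attacks the first object it declares
a target, the sensor threshold is held constant, the battlespace is swept once
along an area coordinate from 0 to `area`, and the false targets that would be
declared form a Poisson process with rate `alpha * q_false` per unit area. The
names `p_tr` (probability of declaring the true target when it is encountered),
`q_false` (probability of declaring an encountered false target), `alpha` and
`area` are hypothetical. The closed-form expressions follow from integrating
over the uniform target position under these assumptions, and the simulation is
included only as a cross-check.

```python
import math
import random


def attack_probabilities(p_tr, q_false, alpha, area):
    """Closed-form (P[target attack], P[false target attack]), constant threshold."""
    x = alpha * q_false * area  # expected number of declared false targets
    if x == 0.0:
        return p_tr, 0.0
    p_any_false = 1.0 - math.exp(-x)
    p_ta = p_tr * p_any_false / x
    # Subtract the cases where the target was attacked before the first
    # declared false target was reached.
    p_fta = p_any_false - p_tr * (1.0 - math.exp(-x) * (1.0 + x)) / x
    return p_ta, p_fta


def simulate(p_tr, q_false, alpha, area, trials=100_000, seed=0):
    """Monte Carlo estimate of the same two probabilities."""
    rng = random.Random(seed)
    rate = alpha * q_false
    target_attacks = false_attacks = 0
    for _ in range(trials):
        target_pos = rng.uniform(0.0, area)
        # Swept area at which the first declared false target is met.
        first_false = rng.expovariate(rate) if rate > 0.0 else math.inf
        target_declared = rng.random() < p_tr
        if target_declared and target_pos < first_false:
            target_attacks += 1
        elif first_false < area:
            false_attacks += 1
    return target_attacks / trials, false_attacks / trials


if __name__ == "__main__":
    print(attack_probabilities(0.8, 0.1, 0.5, 20.0))
    print(simulate(0.8, 0.1, 0.5, 20.0))
```

In this simplified model, a higher false target density or a more permissive
threshold raises `x`, which lowers the probability of target attack and raises
the probability of false target attack.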

1.7 Summary

The aim of this research is to establish optimal control algorithms for the
dynamically varying sensor threshold of an autonomous munition performing a search
and destroy mission. Perhaps one day the effectiveness of a swarm will be improved by
applying methods so that optimally-acting individual agents may work cooperatively;
however, the focus of this thesis remains on the individual agent. In addition, this
research will support the development of theory which directly contributes analytical
tools to gauge mission effectiveness of current assets, both manned and unmanned,
performing similar missions in uncertain environments.

II. Supporting Background and Basic Principles

2.1 Overview

There is a great deal of research that has been accomplished in the field of
cooperative control, which encompasses several subtopics. Various companies, research
agencies and universities have accomplished research that addresses the behavior of
machines acting as autonomous agents in environments with varying degrees of real
world representativeness. The literature available in support of the research contained
in this thesis begins with Jacques and Leblanc’s original work posing the stochastic
performance evaluation analysis tool of autonomous munitions [5]. Further work,
mainly carried out at AFIT, has built upon Jacques’ theory and has introduced a
sound, rigorous, theoretical framework for analyzing autonomous UAVs assigned to
a search, classification and attack mission in a stochastic environment. Further
work has addressed optimization of cooperative decision rule parameters as well as
other characteristics of the environment and the autonomous agents operating within
the environment. Additional optimization performed includes dynamically varying
parameter optimization.

This chapter will discuss previous work that has been accomplished pertaining
to the objectives of this research. Previous optimal decision rule determination as
well as optimal control work will be highlighted. Most of this previous work has been
accomplished at AFIT and this thesis serves as a follow-on to that foundation. In
addition, this chapter will elaborate the theoretical and mathematical foundation of
the optimal control problem ensuing in the following chapter. The chapter concludes
with a statement of the questions left open by the previous work and of the
gaps this research aims to address.

2.2 Scope

The  topic  of  cooperative  control  implies  a  wide  range  of  research  options.  The 
many  subtopics  of  cooperative  control  for  autonomous  UAVs  include  formation  flight 
(e.g.  automated  aerial  refueling),  path  planning,  task  allocation,  and  cooperative 
search, classification, and attack. The focus of this thesis is the optimal search and
attack  mission.  Each  of  the  subtopics  is  related  in  some  way  to  each  of  the  other 
topics  and  an  overall  cooperative  control  scheme  must  be  able  to  efficiently  execute 
each  one;  however,  this  research  assumes  that  parallel  behaviors  and  actions  such  as 
task  allocation  and  path  planning  have  been  solved.  What  remains  is  the  cooperative 
aspect  dealing  with  optimal,  collaborative  search,  classification,  and  attack.  This 
scenario  is  called  persistent  area  denial  by  Jacques  and  Pachter  in  [6].  Further  scoping 
the  problem,  this  research  aims  to  establish  optimal  control  schemes  for  individual 
UAVs  so  that  the  operating  characteristic  of  the  individual  autonomous  agent  may 
be  better  understood  and  incorporated  into  a  cooperative  algorithm. 

The  following  is  an  outline  of  the  previous  work  that  has  been  accomplished. 
This  information  is  presented  as  a  means  of  framing  the  current  work  in  the  context 
relative  to  the  other  research  efforts  that  have  taken  place  in  the  field  of  cooperative 
and  optimal  control. 

2.2.1 Optimal Decision Rules. Using the theoretical foundation
presented later in section 2.3, work has been accomplished to establish
optimal decision making policies for cooperative versus independent search,
classification and attack. Consider circumstances such that a swarm of autonomous munitions
or UAVs carrying munitions is released over a battle space. Each vehicle is capable of
autonomously searching an area of the battle space. The vehicles possess the ability
to detect targets with their array of sensors and subsequently submit the sensor data
to an automatic target recognition (ATR) software package for target classification.
This is how the vehicle determines if the detected object is a target or a false target.
At that point the vehicle may choose to attack the target or request a cooperative
classification attempt of the same target by a nearby vehicle. In uncertain environments,
the cooperative classification may be beneficial, because multiple classifications of the
target will produce a higher degree of confidence in the overall classification. The
increased confidence will result in an increased probability of attacking true targets
and ignoring false ones. Likewise a certain vehicle may request a cooperative attack if
it detects a target and deduces that it has a low probability of killing it with a single
attack or if the vehicle determines that the target is a high priority.
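
The gain in confidence from a second classification can be sketched with a simple
Bayesian update. This is only an illustration, not the decision rule used in the
cited works: it assumes that the two looks are statistically independent given the
object's true type, and the prior and per-look probabilities are hypothetical
values chosen for the example.

```python
def posterior_target(prior, p_declare_target, p_declare_false, declarations):
    """Posterior P(object is a true target) after independent ATR looks.

    prior            -- assumed prior probability that the object is a target
    p_declare_target -- P(ATR declares "target" | object is a true target)
    p_declare_false  -- P(ATR declares "target" | object is a false target)
    declarations     -- iterable of booleans, one per look (True = declared)
    """
    odds = prior / (1.0 - prior)
    for declared in declarations:
        if declared:
            odds *= p_declare_target / p_declare_false
        else:
            odds *= (1.0 - p_declare_target) / (1.0 - p_declare_false)
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical numbers: one look versus two agreeing looks.
    print(posterior_target(0.5, 0.8, 0.2, [True]))
    print(posterior_target(0.5, 0.8, 0.2, [True, True]))
```

With these assumed numbers a single declaration gives a posterior of 0.8, while
two agreeing declarations raise it to about 0.94. In this model a second look
that disagrees returns the posterior to the prior.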

The disadvantage of strictly cooperative behavior is that it requires greater
resources, since the vehicle that was summoned (and agreed) to assist in cooperative
activities forfeited its ability to continue searching and possibly detect additional
targets. The threshold of cooperative activity may vary such that vehicles are more
likely to accept cooperative behavior requests towards the end of the mission since
the probability of encountering a target in the little remaining space to be searched
is minimal. Likewise, in uncertain environments it may be considered more optimal
to forfeit search opportunities in order to address cooperative classification attempts
so that the probability of avoiding false target attack is increased. This may be
especially important in politically sensitive environments. The variation and discovery
of optimal combinations of all these parameters is the essence of the optimal decision
rule work that has been carried out mainly at AFIT by Decker [2], Dunkel [3], Kish,
Jacques, and Pachter [8], and Gillen [4].

2.2.1.1 Methodologies. Gillen’s work specifically addressed the
following objectives [4]:

1.  Establish  a  methodology  for  measuring  the  expected  effectiveness  of 
a  cooperative  system  of  wide  area  search  munitions. 

2.  Develop  optimal  cooperative  engagement  decision  rules  for  a  variety 
of  realistic  scenarios. 

3.  Analyze the sensitivities of the decision rule parameters to the
precision of the munition’s ATR algorithm, the lethality of the warhead,
and the characteristics of the battlefield (clutter density, target
layout, etc.).

Gillen’s goal was to find the optimal combination of decision parameters. He
used a computer simulation to assess the performance of the vehicles during the
mission (i.e. mission success) as a function of the various decision parameters he was
tuning. Gillen used an optimization technique called Response Surface Methodology
(RSM) to optimize the decision rules. RSM was particularly useful for this
application because part of the process inherently enabled the accomplishment of the third
objective cited above, which was to analyze the decision parameter sensitivities to
various scenario parameters [4].

Dunkel’s  work  followed  Gillen’s  and  was  closely  related.  Dunkel’s  research  also 
used  RSM,  but  made  use  of  a  different  computer  simulation  to  accomplish  the  follow¬ 
ing  objectives  [3]: 

1.  Develop  a  simulation  that  incorporates  advantages  as  well  as  possible 
disadvantages  of  cooperative  behavior. 

2. Determine under what circumstances (munition and battlefield characteristic)
it is beneficial to use cooperative behavior and under what
circumstances it is detrimental to use cooperative behavior.

3. Determine the degree of benefit (if any) gained from cooperative behavior
over non-cooperative behavior.

Both  research  efforts  effectively  showed  an  increase  in  mission  efficiency  by  the 
use  of  decision  rules  optimized  through  the  research.  In  addition,  the  latter  work 
presented  a  sound  analysis  of  the  advantages,  disadvantages,  and  general  rules  of 
thumb  concerning  the  use  of  cooperative  control  strategies. 

2.2.2  Dynamically  Varying  Parameter  Optimization.  Another  area  of  opti¬ 
mization  work  that  has  been  accomplished  involves  the  optimal  control  for  dynam¬ 
ically  varying  munition  parameters.  In  particular,  Kish’s  dissertation  [7]  solves  the 
optimal  control  problem  for  determining  the  schedule  of  velocity  and  sensor  thresh¬ 
old  to  maximize  a  munition’s  probability  of  attacking  desired  targets  and  avoiding 
attacking  false  targets.  Most  of  the  work  leading  up  to  Kish’s  dissertation  assumes 
constant  munition  parameters.  However,  the  design  of  autonomous  wide  area  search 
munitions  is  conducive  to  varying  certain  operating  parameters  in  order  to  achieve 
better  performance  as  opposed  to  carrying  out  a  mission  with  fixed,  static  settings  of 
those  parameters.  For  instance,  consider  the  case  of  dynamically  varying  a  munition’s 
sensor threshold. The sensor threshold roughly corresponds to the sensitivity of the
sensor array to detect targets. Lowering the sensor threshold improves the
chance of detecting targets but in doing so increases the likelihood of identifying noise
(false targets) as true targets. Alternatively, increasing the sensor threshold decreases
the probability of misidentifying false targets, but also decreases the overall ability to
detect targets. This thesis follows on from Kish’s work, readdressing the
optimal control solution methodology, paying special attention to continuous time
formulation and solution methods, and interpreting the weapon operating characteristic
results in a unique way.

The  optimization  problem  is  stated  as  follows  [9]: 

$$\max P_{TA} \quad \text{such that} \quad P_{FTA} \le P_{FTA_{max}}$$

Qualitatively this means that it is desirable to increase the probability of attacking
desired targets ($P_{TA}$) while absolutely constraining the probability of false target
attack ($P_{FTA}$). This problem will be fully developed in the following chapters of this
thesis. In [7], Kish develops and solves the problem for a variety of scenarios. The
scenarios are described in section 2.3.1 of this chapter.

Kish’s work affords several valuable insights. First, by considering various upper
bounds on the probability of false target attack, one may observe the trend of the
vehicle’s tendency to commit to attacking an object that it has identified as a target.
As one might expect, the higher the acceptable bound $P_{FTA_{max}}$, the more likely the
munition is to commit to an attack near the end of its time of flight. In other words, it
lowers its sensor threshold toward the end of its mission to make it more probable that
it will detect a target while at the same time increasing the risk (to the maximum acceptable
level) of attacking a false target. In the endgame it might as well “go for broke” since,
for a single-use munition at the end of its mission, if it has not committed to an attack
it is wasted [9]. Comparing the results of the dynamic threshold optimization to the
static threshold case shows clear improvements in mission efficiency by allowing a
dynamically variable sensor threshold. Likewise, Kish’s work shows mission efficiency
improvement by optimally varying other dynamic parameters, namely search area
(velocity).

2.3  Foundation 

Two  key  elements  developed  in  the  literature  are  central  to  this  research.  They 
are  the  Poisson  probability  distribution  and  the  confusion  matrix,  which  build  up  a 
framework  for  stochastic  modeling  of  an  autonomous  UAV’s  environment.  Much  of 
the  research  in  cooperative  control  to  date  has  made  deterministic  assumptions.  In 
most  cases  this  has  been  necessary  to  demonstrate  the  main  principles  of  that  research 
without  any  additional,  unnecessary  complexity.  The  stochastic  approach  attempts  to 
address  an  element  of  the  realism  associated  with  the  actual,  operational  environment 
and  develop  optimal  policies  to  execute  in  those  scenarios.  However,  before  these 
elements  can  be  considered  it  is  necessary  to  provide  a  context  by  establishing  the 
scenario. 

2.3.1  Scenarios.  In  order  to  meaningfully  characterize  a  munition’s  per¬ 
formance  it  is  necessary  to  model  the  environment  in  which  it  is  operating.  The 
battlespace  (or  operating  environment)  models  are  called  scenarios.  Each  scenario 
describes  a  different  set  of  mathematical  assumptions  including  desired  target  distri¬ 
bution  and  false  target  distribution.  In  addition  there  are  certain  other  characteris¬ 
tics  that  are  assumed  about  the  munition  search.  Those  assumptions  are  discussed 
immediately  following  the  list  of  scenarios.  The  scenarios  and  assumptions  permit  a 
tractable  problem  to  be  introduced  and  solved.  Indeed,  as  is  shown  later  in  this  thesis 
as  well  as  in  supporting  literature,  the  mathematical  assumptions  are  not  unrepresen¬ 
tative  of  the  real  world.  In  addition,  the  assumptions  and  scenarios  are  designed  to  be 
calculated from readily available battlespace intelligence. For example, the Poisson
probability distribution is a key element of the false target distribution model (see
section 2.3.1.2), and the Poisson law parameter turns out to be the expected number
of false targets that the munition will encounter during its battlespace sweep. The
Poisson probability law models a random number of encounters during a given time
and is well suited to model a distribution of targets or false targets because, without
further knowledge of the actual location of the false targets, the munition does not
know when it will encounter them. The Poisson distribution yields good
results. Further evidence is presented in chapter V with verification in simulation and
experimentation.

Some  of  the  battlespace  configurations  that  a  munition  may  operate  in  are 
presented  in  [6]  and  are  listed  as  follows: 

• Scenario 1: A single target uniformly distributed among a Poisson field of false
targets

• Scenario 2: A Poisson field of targets distributed among a Poisson field of false
targets

• Scenario 3: A field of N targets uniformly distributed among a Poisson field of
false targets

• Scenario 4: A field of N targets and M false targets, both classes uniformly
distributed

• Scenario 5: A field of N targets normally distributed, centered on the origin,
with some variance $\sigma$ among a Poisson field of false targets

• Scenario 6: A field of N targets and M false targets, both classes normally
distributed, centered on the origin, with target variance, $\sigma_T$, and false target
variance, $\sigma_{FT}$

Kish’s dissertation [7] addresses several of these scenarios as well as additional
complexities such as multiple warhead munitions. This thesis, however, concentrates
on the detailed results of the weapon operating characteristic, and to that end
only Scenario 1 is considered. The assumptions that accompany the scenario
description for this thesis are that the munition has a single
warhead, that is, it is a “single-use” munition; that the munition operates at a constant
velocity; and that the battlespace search area is rectangular and the search pattern is
exhaustive and non-duplicative. In other words, this thesis considers a munition with
a dynamically variable sensor threshold in a battlespace environment with a single
true target and a Poisson distribution of false targets. Scenario 1 is explained further
in section 2.3.1.1.

2.3.1.1 Scenario 1. Scenario 1 is described as “a single target uniformly
distributed amongst a Poisson field of False Targets (FT) in a battle space of
area $A_s$” [6]. The parameters of interest are described in detail below. The results
include the probability of a true target being attacked during the munition’s sweep and the
probability of mission success, which also depends on a probability of kill derived
from the specific munition’s characteristics as well as the environment’s state. Note
that in this thesis the desired target of interest in the Scenario 1 battlespace is often
called the true target to more clearly distinguish it from false targets. The results
also include the probability of a false target being attacked during the munition’s
sweep and the aggregate probability of anything being attacked during the mission
(and conversely the probability that the munition survives the battle space sweep,
which in the case of a single-use munition may very well indicate mission failure). By
incorporating time intervals and integrating the aforementioned probabilities over the
total mission duration, additional information is obtained, such as the longevity of the
munition in the case where it is expended, the probability of the munition lasting for
a specified amount of time, the average longevity of a given target (or false target) in
the battle space, and the average time for a target (or false target) attack to occur.
These elementary probabilities are fully developed in [6].

Figure 2.1 illustrates the rectangular battlespace search area. The figure shows
a munition (recall that in this thesis the munition only has one target attack
opportunity, i.e. one warhead) with velocity $v$ and sensor swath width $w$.

Figure 2.1: Exhaustive, non-duplicative, rectangular battlespace search area
(figure labels: swept length $l = vT$, incremental length $v\,dt$)

The area $A$ searched up to time $t$ is expressed as

$$A = wvt \tag{2.1}$$

and the total battlespace search area $A_s$, for searching that occurs during $0 \le t \le T$
where $T$ is the total battlespace search duration, is

$$A_s = wvT \tag{2.2}$$

This paper’s focus is on deriving a munition’s optimal sensor threshold setting
schedule to maximize the probability of attacking a true target during an engagement
modeled by Scenario 1. First, consider the target encounter. The true target is
uniformly distributed. This means that during an entire battlespace sweep the
probability of encountering the true target in any given area increment $dA$ is
$dA/A_s$. For instance, if units of kilometers are chosen to define the battlespace
search area, $A_s$, and $A_s = A$ km$^2$, then the probability of encountering the target in
any given square kilometer within the search area is $1/A$. Likewise, the temporal
probability of true target encounter during a time interval of length $dt$ is $dt/T$, where $T$ is
the time it takes to search the entire battlespace area. The false target distribution
is modelled differently; the explanation follows.

2.3.1.2 Poisson Probability Distribution. The second outcome results
from encountering a false target. The false targets are distributed according to a
Poisson probability distribution. The Poisson random variable has a sample space, $S$,
of all integers greater than or equal to 0, and the probability of exactly $k$ encounters
is given by the Poisson probability law

$$P(k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots \text{ and } \lambda > 0 \tag{2.3}$$

In terms of the false target distribution, the Poisson probability law gives the
probability of encountering $k$ false targets within the search area. Obviously, an action
that a munition may potentially take against a false target is conditioned upon first
encountering that target. The Poisson probability law is commonly used in queuing
theory and other rate-of-arrival type problems. This makes the Poisson probability
law suitable for describing the false target encounters in the WASM scenario. The
non-dimensional Poisson distribution parameter, $\lambda$, is characterized in terms of
density (number of false targets per unit area), $\alpha$, such that when searching the
area $A$

$$\lambda = \alpha A \tag{2.4}$$


The false target density $\alpha$ can be readily discerned from current battlespace
intelligence such as an Order of Battle. Let $L$ equal the number of false targets assumed to
be randomly distributed over a search area, $A_s$. Then,

$$\alpha = \frac{L}{A_s} \tag{2.5}$$

Furthermore, with the area searched up to time $t$ from equation 2.1, the Poisson law
parameter is readily derived from the available battlespace intelligence and munition
operating characteristics

$$\lambda(t) = \alpha w v t = \frac{L}{A_s} w v t \tag{2.6}$$
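
The chain from battlespace intelligence to a usable encounter probability can be sketched numerically. The short Python example below is illustrative only: the function names and the numbers (5 false targets, a 1 km swath, 100 km/h, a 0.5 h sweep) are assumptions introduced here, not values from this thesis. It computes the density $\alpha = L/A_s$, the time-dependent parameter $\lambda(t) = \alpha w v t$, and the Poisson law $P(k) = e^{-\lambda}\lambda^k/k!$.

```python
import math

def false_target_density(num_false_targets, w, v, total_time):
    """alpha = L / A_s, with A_s = w * v * T (total area swept)."""
    return num_false_targets / (w * v * total_time)

def poisson_parameter(alpha, w, v, t):
    """lambda(t) = alpha * w * v * t: expected false target encounters up to time t."""
    return alpha * w * v * t

def poisson_pmf(k, lam):
    """Poisson law: probability of exactly k encounters for parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical values chosen only for illustration.
L, w, v, T = 5, 1.0, 100.0, 0.5
alpha = false_target_density(L, w, v, T)
lam_half = poisson_parameter(alpha, w, v, T / 2)  # halfway through the sweep
print("lambda at T/2:", lam_half)
print("P(no false target encountered by T/2):", poisson_pmf(0, lam_half))
```

Because $A_s = wvT$, the swath width and velocity cancel and $\lambda(t) = Lt/T$; at the end of the sweep the parameter equals the total number of false targets assumed in the battlespace.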




The Poisson probability law parameter is hence fully developed with basic
information regarding the munition and the battlespace. Equation 2.3 may now be applied
to yield a usable probability. For instance, to determine the probability of attacking
the desired target ($P_{TA}$) it is necessary to know the probability that the munition did
not previously attack a false target. The probability of false target attack ($P_{FTA}$) is
the probability that the munition encounters a false target and incorrectly classifies it
as the true target. Conversely, the probability that the munition does not attack any
false targets, thus enabling it to attack the true target when it encounters it, is built from the
probability of false target encounter (which is modeled with the Poisson probability
law) and the probability that the munition correctly classifies each encountered object as a false
target. The probabilities of target and false target correct and incorrect classification
conditioned upon encountering a given object are fully explained in section 2.3.2 with
the topic of the confusion matrix. For now, suffice it to say that the probability
of correctly classifying a false target is $P_{FTR}$. Thus the probability of attacking
exactly 0 false targets in the search area $A$ is the probability that 0 false targets are
encountered, plus the probability that exactly one false target is encountered and the
munition correctly classifies it, plus the probability that exactly two false targets are
encountered and correctly classified, and so on for any number of potential false
targets up to $\infty$. This summation, resulting in the probability of not attacking a false
target ($P_{\overline{FTA}}$), can be expressed as

$$P_{\overline{FTA}}(A) = \sum_{k=0}^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} P_{FTR}^k \tag{2.7}$$

Factoring and simplifying equation 2.7 and recognizing that

$$\sum_{k=0}^{\infty} \frac{(P_{FTR}\lambda)^k}{k!} = e^{P_{FTR}\lambda}$$

yields

$$P_{\overline{FTA}}(A) = e^{-(1 - P_{FTR})\lambda} = e^{-(1 - P_{FTR})\alpha A} \tag{2.8}$$

The probability of not attacking any false targets, $P_{\overline{FTA}}$, is a piece of the
mathematical foundation for the optimal control problem posed in chapter III. The probability
$P_{\overline{FTA}}$ illustrates how the Poisson probability law is used to generate fundamental
probabilities of interest.
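
As a quick numerical check of this result, the sketch below sums the series term by term (the probability of exactly $k$ encounters times the probability $P_{FTR}^k$ that all $k$ are correctly rejected) and compares it with the closed form $e^{-(1 - P_{FTR})\lambda}$. The function names and the sample values $P_{FTR} = 0.9$, $\lambda = 2.5$ are assumptions made here for illustration.

```python
import math

def p_no_false_attack_series(p_ftr, lam, terms=100):
    """Sum over k of e^{-lam} * lam^k / k! * p_ftr^k, truncated after `terms` terms."""
    total = 0.0
    term = math.exp(-lam)  # k = 0 term
    for k in range(terms):
        total += term
        term *= p_ftr * lam / (k + 1)  # advance from term k to term k + 1
    return total

def p_no_false_attack_closed(p_ftr, lam):
    """Closed form: exp(-(1 - p_ftr) * lam)."""
    return math.exp(-(1.0 - p_ftr) * lam)

p_ftr, lam = 0.9, 2.5  # hypothetical values
print(p_no_false_attack_series(p_ftr, lam))
print(p_no_false_attack_closed(p_ftr, lam))
```

The two printed values should agree to within floating point error, which reflects that the series is simply the Taylor expansion of $e^{P_{FTR}\lambda}$ scaled by $e^{-\lambda}$.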

2.3.2 Confusion Matrix and the Receiver Operating Characteristic. The
second important element of the stochastic model buildup is the idea of the confusion
matrix. The notion of identifying, or classifying, a false target was introduced in
section 2.3.1.2 with the explanation of the Poisson probability distribution. The
difference between real and false targets and the munition’s correct identification of
each upon encounter is really the crux of the stochastic model. Non-deterministic
outcomes must be considered if one hopes to produce a realistic performance metric
for an agent operating in a stochastic battlespace, i.e. the real world. To this end, a
simple, binary confusion matrix is given below [6]:


Table 2.1: Binary confusion matrix: Probabilities of the munition classifying
true and false targets conditioned on true or false target encounter.

                        Encountered Object
Declared Object      True Target       False Target
True Target          $P_{TR}$          $1 - P_{FTR}$
False Target         $1 - P_{TR}$      $P_{FTR}$

Table 2.1 shows the 4 probabilities associated with how a munition will classify
(or declare) an object that it encounters in the battlespace. Complexity can be added
to a confusion matrix by adding different types of targets. Adding such complexity
adds one more row for each additional, specific type of target that the munition can
encounter and a column for each different type of target for which the munition has
a classification template. It is possible that the munition can encounter more types of
objects than it knows how to classify. The remainder of these “unknown”
targets are grouped into a general false target class. Table 2.1 shows the most general
example of a confusion matrix where consideration is paid solely to a single target
of interest and every other object that can possibly confuse the munition’s sensor,
including purposefully deceptive false targets and environmental clutter, is classified
as a false target. The advantage of adding complexity is that it allows consideration of
different types of target for, as an example, assessing the performance of a munition in
attacking priority ranked targets. This thesis, however, will only consider the binary
case.

Each of the values in the four cells of the confusion matrix is a conditional
probability. The two fundamental probabilities are on the diagonal. $P_{TR}$ is the
probability that the munition correctly declares that it has detected a true (desired)
target conditioned on the fact that it actually encounters a true target. Likewise,
$P_{FTR}$ is the probability that the munition correctly declares that it has detected a false
(undesired) target, such as a decoy, conditioned on the fact that it actually encounters
a false target. False targets include objects that are intentionally placed to deceive
the munition as well as natural features inherent in the clutter of the battlespace that
may cause the munition to incorrectly declare the presence of a true target. $P_{TR}$ and
$P_{FTR}$ represent the two possibilities of correct target declaration that a munition may
make based on its associated encounters. The columns of the confusion
matrix must sum to 1 because, for each type of target, true and false, there are only
two possibilities of declaration. The off-diagonal elements are the error probabilities.
The quantity $1 - P_{TR}$ is known as the false negative fraction, or the probability that
the munition will commit a false negative error in the event that it encounters a true
target. The quantity $1 - P_{FTR}$ is the false positive fraction, or the probability that
the munition will commit a false positive error in the event that it encounters a false
target. Mission success is defined by destroying real targets; thus, the confusion matrix
plays a critical role in establishing the performance characteristics of a munition. The
assumption is that anytime a munition declares a true target it will attack it, and
anytime it declares a false target it will keep searching. Thus, the error probabilities
are both detrimental because if the munition encounters a true target and declares it
false, then it will miss the opportunity to attack the target, resulting in mission failure.
Likewise, if the munition encounters a false target and declares it true, it will attack
the false target, essentially wasting itself and eliminating any future probability of
encountering the target of interest. In addition, this second error can also result in
collateral damage if the attacked false target is a non-combatant.
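
The structure of the binary confusion matrix can be made concrete with a small sketch. The layout (rows are the declared object, columns the encountered object) follows Table 2.1; the function name and the sample values $P_{TR} = 0.8$ and $P_{FTR} = 0.9$ are hypothetical and used only for illustration.

```python
def confusion_matrix(p_tr, p_ftr):
    """Binary confusion matrix.

    Rows:    declared object    (true target, false target)
    Columns: encountered object (true target, false target)
    """
    return [
        [p_tr, 1.0 - p_ftr],   # declared true: correct detection, false positive
        [1.0 - p_tr, p_ftr],   # declared false: false negative, correct rejection
    ]

m = confusion_matrix(0.8, 0.9)  # hypothetical operating point
column_sums = [m[0][j] + m[1][j] for j in range(2)]
print(column_sums)  # each column should sum to 1 (up to rounding)
```

Each column describes the two mutually exclusive declarations available for one kind of encountered object, which is why the column sums equal 1 while the row sums generally do not.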

As  Jacques  and  Pachter  [6]  point  out,  the  ideal  confusion  matrix  would  be  no 
confusion  at  all,  or,  in  other  words,  a  perfect  identity  matrix.  Ones  on  the  diagonal 
and  zeros  elsewhere  would  indicate  that  all  of  the  vehicle’s  sensor  information  was 
perfect,  delivering  the  precise  nature  of  the  object  that  was  detected.  If  the  vehicle 
encountered  a  true  target,  it  would  always  attack  it  leading  to  mission  success  whereas 
if  it  encountered  a  false  target  it  would  always  declare  it  as  such  and  choose  to 
continue  searching.  Sadly,  the  perfect  case  is  purely  theoretical  since  an  ideal  confusion 
matrix  is  tantamount  to  omniscience.  The  ideal  confusion  matrix  has  no  practical 
application  because,  unfortunately,  the  imprecision  of  sensors  in  general  as  well  as 
the  inaccuracy  and  ambiguity  of  automatic  target  recognition  algorithms  means  that 
sometimes  the  vehicle  will  make  an  errant  declaration.  Errors  will  inevitably  happen  in 
actual  scenarios  which  validates  the  reasoning  behind  the  confusion  matrix  -  especially 
the  nontrivial  case  with  non-zero  off-diagonal  elements. 

In fact, the true nature of a munition’s sensor is decidedly un-ideal. $P_{TR}$ is
like a threshold that the munition uses to discriminate objects that appear to be real
targets and ones that don’t. Note that in this example $P_{TR}$ is inversely related to the
sensor threshold level. That is, lowering the threshold level will cause the munition to
consider more objects as real targets, i.e. it will be less discriminating, which will, in
turn, increase the probability that the munition will make the correct declaration when
it encounters a real target. However, $P_{TR}$ is absolutely and inextricably related to the
false positive fraction. Lowering the sensor’s threshold, i.e. increasing $P_{TR}$, makes the
munition less discriminating, which unavoidably increases the munition’s susceptibility
to declaring a false target as a true target. In a real-world representation, $P_{TR}$ is
always monotonically increasing with $1 - P_{FTR}$, so increasing $P_{TR}$ unavoidably pushes
$P_{FTR}$ further from its ideal value of 1.

The realistic sensor performance characteristic is described by a concept known
as the Receiver Operating Characteristic (ROC). The ROC is the relationship between
$P_{TR}$ and the false positive fraction. The ROC that is used in this work is extracted
from [9] and has been commonly accepted as a representative sensor characteristic
for the subject munition systems. However, other ROC relationships may be used as
long as they meet certain fundamental requirements. The ROC used here is given by


$$1 - P_{FTR} = \frac{P_{TR}}{c - (c - 1) P_{TR}} \tag{2.9}$$


The  ROC  is  parameterized  by  the  non-dimensional  scalar  c  which  is  a  function  of 
various  operational  and  design  characteristics.  Basically,  it  describes  how  well  the 
munition  system  is  able  to  discriminate  between  true  and  false  targets  at  a  given 
sensor  threshold  setting.  The  higher  the  value  of  c,  the  better.  Examples  of  aspects 
that  affect  c  include  munition  velocity,  sensor  quality,  ATR  algorithm  effectiveness, 
and target aspect, i.e. the number of pixels that the sensor is able to detect based
on the target’s exposure. If the munition flies slower, it will most likely be able to
capture more information on a given potential target by dwelling its sensor longer
on the object, which improves the sensor’s chance of making a correct classification.
Another  example  of  improving  the  value  c  is  installing  a  better  quality  sensor  or 
ATR  algorithm.  It  is  more  favorable  to  the  munition  if  the  sensor  is  able  to  better 
discriminate  target  features  without  adjusting  its  threshold.  Figure  2.2  shows  a  family 
of  ROC  curves  with  varying  values  of  c.  Note  that  as  c  increases,  the  true  to  false 
positive  ratio  becomes  more  favorable. 
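
The effect of $c$ on the ROC can be tabulated with a few lines of Python. This is a minimal sketch assuming the ROC form $1 - P_{FTR} = P_{TR} / (c - (c - 1)P_{TR})$; the function name and the sampled values of $c$ and $P_{TR}$ are choices made here for illustration.

```python
def false_positive_fraction(p_tr, c):
    """ROC: false positive fraction (1 - P_FTR) as a function of P_TR and c."""
    return p_tr / (c - (c - 1.0) * p_tr)

# Tabulate a small family of ROC curves for hypothetical values of c.
for c in (1, 10, 100):
    row = [round(false_positive_fraction(p, c), 4) for p in (0.0, 0.5, 0.9, 1.0)]
    print("c =", c, "->", row)
```

For $c = 1$ the curve is the diagonal (the sensor cannot discriminate at all), and for larger $c$ the same $P_{TR}$ is obtained at a smaller false positive fraction, while every curve still passes through (0, 0) and (1, 1).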

Figure 2.2: ROC curve for varying values of c

Figure 2.2 also demonstrates the realism introduced to the problem by more
accurately representing munition sensor characteristics, namely, avoiding the
impossible ideal confusion matrix scenario. As previously mentioned, the concept of the
ROC is heuristic, so the ROC in equation 2.9 is not the only ROC that may be used;
however, the given form has been shown to fit empirical data [11]. Also, the ROC
used in this thesis meets the requirements for a valid ROC. First, the curve has to
be monotonically increasing. In addition, the points (0, 0) and (1, 1) must lie on (and
bound) the curve. The meaning of the endpoints is important. Recall that the ideal
theoretical confusion matrix is the identity matrix; however, ones on the diagonal of
the confusion matrix would produce the ordered pair (0, 1) on the ROC curve, which
only exists in the limit as $c \to \infty$. Essentially, the ROC says that in order to eliminate
the possibility of committing a false positive error, the munition must also dismiss
any probability of detecting a real target. On the opposite side, if the munition wants
to make sure to detect the true target with probability 1, it must also accept that it
has committed to attacking anything it sees.

A real-world munition may be flown in an artificial, test battlespace with
representative true and false targets. The frequency of correct classifications at various
sensor threshold settings may be used to populate various points, each of which corresponds
to an individual confusion matrix, on a single ROC curve. A ROC curve can then be
empirically fit with equation 2.9 and the sensor quality parameter $c$ can be solved for. It is
imperative that the sensor package be characterized well because the optimal sensor
threshold, which is the goal of this thesis, is only as good as the sensor
characterization used to compute it.
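
One simple way to recover $c$ from test data can be sketched as follows. Solving the ROC $1 - P_{FTR} = P_{TR}/(c - (c - 1)P_{TR})$ for $c$ gives $c = P_{TR}\,P_{FTR} / [(1 - P_{FTR})(1 - P_{TR})]$ at each measured operating point. Averaging the per-point estimates with a geometric mean, as done below, is an assumption made here for illustration and not a method taken from this thesis; a least-squares fit would be an equally reasonable choice. The data points are synthetic.

```python
import math

def estimate_c(points):
    """Estimate the ROC parameter c from measured operating points.

    points: iterable of (false_positive_fraction, p_tr) pairs, each value
    strictly between 0 and 1. Returns the geometric mean of per-point estimates.
    """
    logs = []
    for fpf, p_tr in points:
        c_i = p_tr * (1.0 - fpf) / (fpf * (1.0 - p_tr))  # ROC solved for c
        logs.append(math.log(c_i))
    return math.exp(sum(logs) / len(logs))

# Synthetic operating points generated from a hypothetical sensor with c = 100.
true_c = 100.0
points = [(p / (true_c - (true_c - 1.0) * p), p) for p in (0.5, 0.7, 0.9)]
print(estimate_c(points))
```

With noise-free synthetic data every per-point estimate equals the generating value of $c$; with real test data the spread of the per-point estimates gives a rough indication of how well the one-parameter ROC describes the sensor.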

The munition’s sensor performance at a fixed threshold is characterized by a
single confusion matrix. A ROC curve effectively represents an infinite number of
confusion matrices. Adjusting the munition’s sensor threshold varies $P_{TR}$ and hence
the munition’s operating point on the ROC curve, which is given by the ordered pair
$(1 - P_{FTR}, P_{TR})$. Dynamically varying the sensor threshold moves the operating point
along the ROC curve, which changes the munition’s confusion matrix and the fundamental
characterization of the munition and its sensor. The goal of the optimization
in this thesis is to find the optimal schedule for varying $P_{TR}$ such that, for a given $c$,
the munition avoids attacking false targets and maximizes its probability of attacking
the real one.
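
To illustrate how the operating point trades detection against false attacks, the following sketch sweeps a fixed (static) $P_{TR}$ for Scenario 1. The expression used for $P_{TA}$ is an assumption made here for illustration: it averages $P_{TR}\,e^{-(1 - P_{FTR})\lambda(t)}$ over a true target encounter time that is uniform on $[0, T]$, using the Poisson result of section 2.3.1.2 and the ROC of equation 2.9. The values of $c$ and of the expected number of false targets are hypothetical, and the optimal control formulation in the following chapters supersedes this sketch.

```python
import math

def false_positive_fraction(p_tr, c):
    """ROC of equation 2.9: returns 1 - P_FTR for a given P_TR."""
    return p_tr / (c - (c - 1.0) * p_tr)

def static_p_ta(p_tr, c, expected_false_targets):
    """Assumed P_TA for a fixed threshold in Scenario 1.

    Averages P_TR * exp(-(1 - P_FTR) * lambda(t)) over an encounter time
    uniform on [0, T], where lambda(T) = expected_false_targets.
    """
    x = false_positive_fraction(p_tr, c) * expected_false_targets
    if x == 0.0:
        return p_tr  # limit of (1 - exp(-x)) / x as x -> 0 is 1
    return p_tr * (-math.expm1(-x)) / x

c = 100.0       # hypothetical sensor quality parameter
n_false = 5.0   # hypothetical expected number of false targets in the battlespace
grid = [i / 100.0 for i in range(101)]
best_p_tr = max(grid, key=lambda p: static_p_ta(p, c, n_false))
print(best_p_tr, static_p_ta(best_p_tr, c, n_false))
```

Under these assumptions the best static operating point lies strictly inside the interval (0, 1): a very selective sensor rarely attacks the true target, while a very permissive one is usually expended on a false target first.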

2.4  Summary 

Over the past several years a sound theoretical foundation has been developed
building on Jacques’ and Leblanc’s original research at Eglin AFB, FL. The resulting
framework supports rigorous theory that provides analytical tools to assess the
effectiveness of autonomous UAVs in a cooperative search, classification and attack
function. In addition, multiple optimization efforts have been accomplished which present
a cooperative decision rule optimization process as well as an analytical framework
for the resulting optimal decision strategies. Also, optimal control work has
identified ideal schedules for a munition’s dynamically varying parameters. One of the
key pieces of work in the optimal control area is Kish’s dissertation [7]. This thesis
will address a subset of the optimal control work presented in [7] by readdressing the
Scenario 1 optimal dynamic sensor threshold problem, paying special attention to the
continuous time formulation and solution strategy as well as presenting the weapon
operating characteristic in a unique and detailed way. Remaining questions include
dynamic sensor threshold optimization combined with optimal decision policies for a
cooperative search, classification, and attack mission to be carried out by autonomous
unmanned aerial vehicles.

III. Optimal Control of Dynamically Varying Sensor Threshold

3.1 Chapter Overview

Chapter II presented the core mathematical foundation from which the probabilities
of interest, namely $P_{TA}$ and $P_{FTA}$ as functions of a munition's dynamic
controls, will be developed. Velocity can be varied, but in this thesis velocity is
assumed constant and only the sensor threshold is varied. Holding velocity constant is a
simplifying assumption that allows one to focus on the weapon operating characteristic
(WOC) results. This chapter builds on the foundation in Chapter II by posing
and solving the optimal control problem. The static optimization is presented first as
a baseline, where the optimal fixed sensor threshold is found. The dynamic optimal
control problem follows by first building the unconstrained problem and then adding
a constraint on the maximum allowable probability of false target attack ($P_{FTA_{max}}$).
This chapter concludes with the same constrained optimal control problem posed as a
discrete dynamic optimization problem. Solving the discrete formulation should
corroborate the results of the continuous time solution. Chapter IV presents the results
of the optimal control solution, namely the WOC, and interprets them. Chapter V
concludes the thesis with a discussion of the results and of how the theory is
applied to current operational scenarios.

3.2  Foundation 

The objective of this thesis is to produce an optimal control time history
maximizing the probability of true target attack in a given search space. Thus, from this
point, temporal relationships will be adopted and probabilities relating to incremental
areas will be abandoned. The two are interchangeable; however, in this work,
probabilities relating to time will be used. In Chapter II the Poisson parameter $\lambda$
is developed as a function of the area searched, $A$, as in equation 2.4. Thus, with
$\lambda = \alpha A$, equation 2.8 is presented in terms of incremental area. In order to
transform this to a probability dependent on time, note that since $A_s = wvT$ and $A = wvt$,

$$A = A_s \frac{t}{T} \tag{3.1}$$

This makes sense, as the area $A$ searched by the munition up to time $t$ is the search
time fraction of the total battlespace search area (remember that a constant velocity
munition is assumed). Furthermore, the overall desired search area for the probability
in equation 2.8 is the munition's entire battlespace search area, $A_s$, thus let

$$\lambda = \alpha A_s \tag{3.2}$$

Combining equations 3.1 and 3.2 yields the desired parameter of the Poisson
probability law

$$\alpha A = \lambda \frac{t}{T} \tag{3.3}$$

Then, from equation 2.8, the probability of not attacking any false targets as a function
of time is given by

$$\Pr\{\text{no false target attack before } t\} = e^{-(1 - P_{FTR}) \lambda \frac{t}{T}} \tag{3.4}$$

The overall probability density function (pdf) corresponding to the probability,
$f(t)\,dt$, that the intended target is attacked during the time interval $[t, t + dt]$ is given by

$$f(t) = \frac{1}{T}\, P_{TR}\, e^{-(1 - P_{FTR}) \lambda \frac{t}{T}} \tag{3.5}$$

Another way of thinking of equation (3.5) is that the time of true target attack, $t$, is
a random variable and $f(t)$ is its pdf. By component, the probability
$f(t)\,dt$ is the probability that the true target is encountered in that interval
($\frac{dt}{T}$) times the probability that the munition correctly classifies the encountered
target ($P_{TR}$) times the probability that the munition has not previously engaged a false
target ($e^{-(1 - P_{FTR}) \lambda t / T}$).

In the optimal control problem the probability of attacking the true target
during the battlespace sweep will be the objective function to maximize. However,
the achievement of this goal will be constrained by a limit on the probability of attacking
a false target. Thus the pdf, $g(t)$, for a false target attack must also be obtained.
The probability, $g(t)\,dt$, of attacking a false target during the time interval $[t, t + dt]$
is the probability that the munition has not attacked the true target before time $t$,
either because it has not yet encountered it or because it encountered it and incorrectly
classified it (with probability $1 - P_{TR}$, also known as a false negative error), times the
probability that the munition has not attacked a false target before time $t$,
times the probability that the munition encounters a false target during the time
interval $[t, t + dt]$ and incorrectly classifies it (with probability $1 - P_{FTR}$, also known
as a false positive error). Thus, the pdf is

$$g(t) = \left[1 - P_{TR} \frac{t}{T}\right] e^{-(1 - P_{FTR}) \lambda \frac{t}{T}}\, (1 - P_{FTR})\, \frac{\lambda}{T} \tag{3.6}$$

Several probabilities relevant to the WASM performance may be derived from the two
fundamental probability density functions, $f(t)$ and $g(t)$, including the probability of
mission success and the probability that the munition does not engage anything at
all, resulting in its survival of the battlespace sweep. These derivations are presented
in Jacques and Pachter [6].
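
The two pdfs can be checked numerically against each other. The sketch below is a minimal
illustration, not the thesis implementation: it assumes the ROC form
$1 - P_{FTR} = P_{TR} / (c - (c - 1) P_{TR})$ for equation (2.9), a constant $P_{TR}$, and
hypothetical parameter values, and it integrates $f(t)$ and $g(t)$ with a simple midpoint rule.
Together with the probability that the munition attacks nothing during the sweep, the two
integrals should sum to one.

```python
import math


def roc_false_positive(p_tr, c):
    """Assumed ROC form of Eq. (2.9): returns 1 - P_FTR for a given P_TR."""
    return p_tr / (c - (c - 1.0) * p_tr)


def f_pdf(t, p_tr, lam, c, T=1.0):
    """pdf of true target attack at time t, Eq. (3.5)."""
    q = roc_false_positive(p_tr, c)
    return (p_tr / T) * math.exp(-q * lam * t / T)


def g_pdf(t, p_tr, lam, c, T=1.0):
    """pdf of false target attack at time t, Eq. (3.6)."""
    q = roc_false_positive(p_tr, c)
    return (1.0 - p_tr * t / T) * math.exp(-q * lam * t / T) * q * lam / T


def integrate(func, a, b, n=2000):
    """Composite midpoint rule on [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


# Hypothetical values: P_TR = 0.8, lambda = 10 expected false targets, c = 100, T = 2.
p_tr, lam, c, T = 0.8, 10.0, 100.0, 2.0
p_ta = integrate(lambda t: f_pdf(t, p_tr, lam, c, T), 0.0, T)
p_fta = integrate(lambda t: g_pdf(t, p_tr, lam, c, T), 0.0, T)
# Probability of attacking nothing: true target not attacked and no false attack by T.
p_none = (1.0 - p_tr) * math.exp(-roc_false_positive(p_tr, c) * lam)
print(p_ta, p_fta, p_none, p_ta + p_fta + p_none)
```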


3.3 Static $P_{TR}$

The pdfs obtained in Section 3.2 lay the foundation for evaluating the probability
$P_{TA}$ of successfully attacking the intended true target. The objective is to
maximize $P_{TA}$ by optimally manipulating the sensor threshold-determined probability of
target report $P_{TR}$ while at the same time mitigating the consequence of increasing
$P_{TA}$, which, unfortunately, is an undesirable simultaneous increase in the probability
$P_{FTA}$ of attacking a false target. For this investigation, which assumes a constant
velocity munition, the munition's single control variable is the probability of target
report, $P_{TR}$, which is equivalent to setting the munition's sensor threshold. The first
step in understanding the optimal control problem is to gain insight by addressing
the static optimization problem, namely, the optimal setting of a constant $P_{TR}$.

As previously mentioned, the objective function to maximize is the probability
of target attack during $0 \le t \le T$. The control variable is $P_{TR}(t)$, but for the static
optimization a constant, optimal value, $P_{TR}^*$, is chosen for all $t$. Furthermore, since
the control variable, $P_{TR}$, is a probability, it is constrained according to $0 \le P_{TR} \le 1$.
Equation (3.5) is the pdf for the true target attack during a time interval of length $dt$
beginning at time $t$, so to obtain the overall probability of target attack in the time
interval of interest (the entire battlespace sweep) the pdf must be integrated. Thus,
the performance function $P_{TA}$ is given by

$$P_{TA} = \int_0^T f(t)\, dt \tag{3.7}$$

For clarity, from here on the Poisson parameter $\lambda$ in equation (3.5) will be
replaced with $\lambda_{FT}$ to indicate that it is the Poisson parameter corresponding to the
false targets' distribution in the battlespace. In addition, $f(t)$ should be in terms
of the control variable, $P_{TR}$, so the term $1 - P_{FTR}$ is eliminated using the sensor's
ROC, equation (2.9). With these substitutions the static optimization problem is
then

$$\max_{0 \le P_{TR} \le 1} P_{TA} = \int_0^T \frac{1}{T}\, P_{TR}\, e^{-\lambda_{FT} \frac{P_{TR}}{c - (c - 1) P_{TR}} \frac{t}{T}}\, dt \tag{3.8}$$

Non-dimensionalizing the time by setting $T := 1$ results in the payoff function

$$P_{TA} = \int_0^1 P_{TR}\, e^{-\lambda_{FT} \frac{P_{TR}}{c - (c - 1) P_{TR}}\, t}\, dt \tag{3.9}$$

Integrating equation (3.9) yields the objective function

$$P_{TA}(P_{TR}) = \frac{c - (c - 1) P_{TR}}{\lambda_{FT}} \left(1 - e^{-\lambda_{FT} \frac{P_{TR}}{c - (c - 1) P_{TR}}}\right) \tag{3.10}$$

Equation (3.10) is then the mission objective $P_{TA}$ for given values of the problem
parameters $\lambda_{FT}$ and $c$. One seeks to select an optimal static control setting, $P_{TR}^*$, to
apply throughout the mission.

In order to analyze constrained solutions, the expression for the probability of
a false target attack, $P_{FTA}$, must also be derived. Following the same procedure as for
$P_{TA}$, the pdf of false target attacks, $g(t)$, must be integrated. Applying the
same substitutions as before for $\lambda$ and $1 - P_{FTR}$ in equation (3.6) and integrating
yields the cost function

$$P_{FTA}(P_{TR}) = \int_0^1 g(t)\, dt \tag{3.11}$$

$$= 1 - \frac{c - (c - 1) P_{TR}}{\lambda_{FT}} \left(1 - e^{-\lambda_{FT} \frac{P_{TR}}{c - (c - 1) P_{TR}}}\right) + (P_{TR} - 1)\, e^{-\lambda_{FT} \frac{P_{TR}}{c - (c - 1) P_{TR}}} \tag{3.12}$$

The results, including the static WOC, are presented in Chapter IV. Using
equations 3.10 and 3.12, one can solve for the best possible probability of target attack
during a munition's battlespace sweep given a maximum allowable probability of
attacking a false target. This single munition performance metric is the essence of the
WOC.
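
A point on the static WOC can be approximated directly from the two closed-form expressions.
The following is a minimal sketch under stated assumptions: the closed forms are those obtained
by integrating equations (3.9) and (3.11) with a constant $P_{TR}$ and $T = 1$, the grid search
is a stand-in for whatever optimizer is actually used, and the function names and parameter
values are hypothetical. It assumes $\lambda_{FT} > 0$.

```python
import math


def static_pta(p_tr, lam_ft, c):
    """Closed-form P_TA for a constant P_TR, Eq. (3.10), with T = 1."""
    d = c - (c - 1.0) * p_tr
    return (d / lam_ft) * (1.0 - math.exp(-lam_ft * p_tr / d))


def static_pfta(p_tr, lam_ft, c):
    """Closed-form P_FTA for a constant P_TR, Eq. (3.12), with T = 1."""
    d = c - (c - 1.0) * p_tr
    decay = math.exp(-lam_ft * p_tr / d)
    return 1.0 - (d / lam_ft) * (1.0 - decay) + (p_tr - 1.0) * decay


def static_woc_point(pfta_max, lam_ft, c, n=10001):
    """Grid search for the constant P_TR that maximizes P_TA with P_FTA <= pfta_max.

    Returns the tuple (P_TA, P_FTA, P_TR) of the best grid point found.
    """
    best = (0.0, 0.0, 0.0)
    for i in range(n):
        p_tr = i / (n - 1)
        pfta = static_pfta(p_tr, lam_ft, c)
        if pfta <= pfta_max:
            pta = static_pta(p_tr, lam_ft, c)
            if pta > best[0]:
                best = (pta, pfta, p_tr)
    return best


# Hypothetical scenario: 10 expected false targets, ROC parameter c = 100.
for bound in (0.05, 0.2, 1.0):
    print(bound, static_woc_point(bound, 10.0, 100.0))
```

Sweeping `pfta_max` from 0 to 1 and plotting the returned $P_{TA}$ against the bound traces an
approximation of the static WOC for the chosen $\lambda_{FT}$ and $c$.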

3.4 Dynamic $P_{TR}$

Section 3.3 presented and discussed the methodology and solution for obtaining
the maximum probability of true target attack for a fixed sensor threshold, that is,
a fixed $P_{TR}$. These results are useful; however, the design of wide area search
munitions allows for dynamically varying the sensor's threshold. It is thus desirable to
obtain the optimal dynamic $P_{TR}$ schedule such that the mission probability of target
attack is maximized. This optimal control problem is analyzed in Sections 3.4.1 and
3.4.2. First, the continuous time formulation and solution will be presented. The
elegance and simplicity of the Poisson probability distribution permits a continuous time,
closed-form optimal control solution to be obtained. The continuous solution will be
corroborated in the following section by a discrete time formulation and numerical
solution using MATLAB®.

3.4.1 Continuous Optimal Control Problem. Similar to the static case in
Section 3.3, the unconstrained problem will be analyzed first, followed by the inclusion
of the constraint on the probability of false target attack.

3.4.1.1 Unconstrained Case. The unconstrained optimal control problem
statement is

$$\max_{P_{TR}} P_{TA}$$

Recall from before that the objective, $P_{TA}$, is the integral of the pdf of true target
attack during the battlespace sweep. Recalling equation (3.7),

$$P_{TA} = \int_0^1 f(t)\, dt = \int_0^1 u\, e^{-\int_0^t \lambda_{FT} \frac{u}{c - (c - 1) u}\, d\tau}\, dt \tag{3.13}$$

Note that in the problem formulation the following notation is used

$$u \triangleq P_{TR}$$

Also, as before, the objective function is normalized by setting $T = 1$. Finally, note
that the exponent has been replaced with the equivalent integral form to facilitate
the state definition. By introducing the state dynamics as

$$\dot{x} = \frac{u}{c - (c - 1) u}, \quad x(0) = 0, \quad 0 \le t \le 1 \tag{3.14}$$

and recognizing that

$$x(t) = \int_0^t \frac{u}{c - (c - 1) u}\, d\tau \tag{3.15}$$

the problem statement can be rewritten as

$$\max_u \int_0^1 u\, e^{-\lambda_{FT} x}\, dt \quad \text{subject to the dynamics (3.14)} \tag{3.16}$$

The Hamiltonian is formed by appending the dynamic constraint to the objective
with a costate, $\lambda_x$,

$$H = u\, e^{-\lambda_{FT} x} + \lambda_x \frac{u}{c - (c - 1) u} \tag{3.17}$$

The costate differential equation is

$$\dot{\lambda}_x = -\frac{\partial H}{\partial x} = \lambda_{FT}\, u\, e^{-\lambda_{FT} x}, \quad \lambda_x(1) = 0 \tag{3.18}$$

From equation (3.18) it can be seen that the costate is monotonically increasing since
its time derivative is positive whenever $u > 0$. Combining this fact with the costate
boundary condition, also given in equation (3.18), one infers that

$$\lambda_x(t) \le 0, \quad 0 \le t \le 1 \tag{3.19}$$

The same type of insight can be derived from the state dynamics. It can be shown
from equation (3.14) that the state, $x$, is monotonically increasing since its derivative
is positive whenever $u > 0$. Since the initial value of the state is $x(0) = 0$, $x(t) \ge 0$ for all
$0 \le t \le T$. These insights will be useful in characterizing the solution. The optimality
condition is

$$\frac{\partial H}{\partial u} = 0 = e^{-\lambda_{FT} x} + \lambda_x \frac{c}{\left[c - (c - 1) u\right]^2} \tag{3.20}$$

The optimal control is obtained by solving for $u$ in equation 3.20 and is given by

$$u^* = \frac{-\sqrt{-\lambda_x c}\; e^{\frac{\lambda_{FT}}{2} x} + c}{c - 1} \tag{3.21}$$

One may confirm that this extremum yields the desired maximum of the Hamiltonian
by observing

$$\frac{\partial^2 H}{\partial u^2} \le 0, \quad 0 \le u \le 1 \tag{3.22}$$
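
The stationary control can be checked numerically against the optimality condition. The sketch
below is a minimal illustration: it evaluates $\partial H / \partial u$ from equation (3.20)
and the stationary control from equation (3.21), and clips the result to $[0, 1]$ because the
control is a probability. The clipping, the function names, and the sample values are
assumptions of the example; since $H$ is concave in $u$ for $\lambda_x \le 0$ and $c > 1$, the
clipped value is the maximizer of $H$ on the admissible interval.

```python
import math


def dH_du(u, x, lam_x, lam_ft, c):
    """Partial derivative of the Hamiltonian with respect to u, Eq. (3.20)."""
    d = c - (c - 1.0) * u
    return math.exp(-lam_ft * x) + lam_x * c / d ** 2


def u_star(x, lam_x, lam_ft, c):
    """Stationary control of Eq. (3.21), clipped to the admissible range [0, 1]."""
    if lam_x >= 0.0:
        # At lam_x = 0 the stationary point is c / (c - 1) > 1, so u saturates.
        return 1.0
    raw = (c - math.sqrt(-lam_x * c) * math.exp(0.5 * lam_ft * x)) / (c - 1.0)
    return min(1.0, max(0.0, raw))


# Hypothetical interior point: x = 0.01, lam_x = -0.5, lam_FT = 10, c = 100.
u = u_star(0.01, -0.5, 10.0, 100.0)
print(u, dH_du(u, 0.01, -0.5, 10.0, 100.0))
```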

Substituting the optimal control, $u^*$, from equation (3.21) into the state and
costate dynamics from equations (3.14) and (3.18) gives the two point boundary
value problem (TPBVP)

$$\dot{x} = -\frac{1}{c - 1} \left(1 - \frac{\sqrt{c}\; e^{-\frac{\lambda_{FT}}{2} x}}{\sqrt{-\lambda_x}}\right), \quad x(0) = 0, \quad 0 \le t \le 1 \tag{3.23}$$

$$\dot{\lambda}_x = \lambda_{FT} \sqrt{-\lambda_x c}\; e^{-\frac{\lambda_{FT}}{2} x}\, \dot{x}, \quad \lambda_x(1) = 0, \quad 0 \le t \le 1 \tag{3.24}$$

The idea is to solve the two point boundary value problem posed by equations (3.23)
and (3.24), which returns the optimal state and costate trajectories; these are then
substituted into equation (3.21) to produce the optimal control schedule. The solution
method is presented below, where the two equations are reduced to a single differential
equation that is a function of the state variable $x$ and an initial guess of the final
state value. This final form of the TPBVP can easily be solved using a single shooting
method, especially since the state dynamics are transparent and provide ample insight as
to which direction to adjust the initial guess to converge on a solution. However, there
is a problem hidden in equations (3.23) and (3.24) that does not become apparent
until one considers that the optimal control schedule is not continuous in its
first derivative, i.e. it is piecewise smooth but has a corner. Specifically, the optimal
control is subject to the laws of probability and is bounded in the interval [0, 1].
As a result, the control inevitably saturates at some time in the unconstrained
problem. The principle behind the control saturation, including the saturation time
and its impact on the problem, will be investigated later in this section. For now,
suffice it to say that the TPBVP in the form of equations (3.23) and (3.24) ignores the
existence of a time after which the optimal control schedule does not obey equation (3.21).
The solution method following from equations (3.23) and (3.24) is presented below because
it is mathematically correct and illustrates a solution methodology pitfall that is easy
to overlook; however, the solution is only actually valid when the control saturation
time is identically equal to 1, which only occurs in the limit $c \to \infty$, which is
outside the set of valid values for $c$. The recommended solution methodology is presented
at the end of this section.

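What follows is the faulty solution methodology that ignores the existence or
possibility of a control saturation. In theory it is a promising solution methodology
because the dynamics of the state are fairly well understood; therefore, the TPBVP
can be solved using the single shooting method, with insights from the state
dynamics driving the initial guess for convergence of the shooting method. First the
system of differential equations is reduced to a single differential equation that is a
function of a single variable and unknown boundary conditions. In this case, the
costate differential equation can be solved in terms of $x$ and $x(1)$ and substituted
back into the state differential equation to apply the shooting method. Letting

$$y \triangleq -\lambda_x$$

the following expression is formed from equation 3.24

$$\frac{\dot{y}}{\sqrt{y}} = -\lambda_{FT} \sqrt{c}\; e^{-\frac{\lambda_{FT}}{2} x}\, \dot{x} \tag{3.25}$$

Recognize that

$$\frac{d}{dt} \sqrt{y} = \frac{\dot{y}}{2 \sqrt{y}}$$

and that

$$\frac{d}{dt}\, e^{-\frac{\lambda_{FT}}{2} x} = -\frac{\lambda_{FT}}{2}\, e^{-\frac{\lambda_{FT}}{2} x}\, \dot{x}$$

It can now be shown that

$$\frac{d}{dt} \sqrt{y} = \sqrt{c}\; \frac{d}{dt}\, e^{-\frac{\lambda_{FT}}{2} x} \tag{3.26}$$

Integrating both sides yields

$$\sqrt{y} + \text{Const} = \sqrt{c}\; e^{-\frac{\lambda_{FT}}{2} x} \tag{3.27}$$

which reduces to

$$y = \left(\sqrt{c}\; e^{-\frac{\lambda_{FT}}{2} x} - \text{Const}\right)^2 \tag{3.28}$$

Substituting the previous definition for $y$ in equation 3.28 results in

$$\lambda_x = -\left(\sqrt{c}\; e^{-\frac{\lambda_{FT}}{2} x} - \text{Const}\right)^2 \tag{3.29}$$

The final step is to apply the costate boundary condition, $\lambda_x(1) = 0$, which results
in the costate solution in terms of the state variable, $x$, and its unknown boundary
condition, $x(1)$.

$$\lambda_x = -c \left(e^{-\frac{\lambda_{FT}}{2} x} - e^{-\frac{\lambda_{FT}}{2} x(1)}\right)^2 \tag{3.30}$$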


Though the process is mathematically correct thus far, the inconsistency with
the requirements for a valid control schedule, namely $0 \le P_{TR} \le 1$, first appears in
equation (3.30). Even though the costate requirement for a free-final-state optimal
control problem, $\lambda_x(1) = 0$, was enforced in producing equation (3.30), the
underlying assumption is that the resulting solution variable, $x(1)$, is the
final value of the state trajectory solution for equation (3.23), which incorporates
one and only one form for the optimal control, namely that given in equation (3.21). The
substitutions that led up to and supported equation (3.30) do not permit any
modifications or mode changes outside of what is permitted by equation (3.21). Thus
the solution is automatically invalidated if $u^* = 1$ at any time before $t = 1$, which is
demonstrated below to always occur for the unconstrained solution. The inconsistency is
not readily apparent if one solves the problem using this solution methodology for
parameter combinations of $c$ and $\lambda_{FT}$ that produce a saturation time close to 1. High
expected numbers of false targets, which correspond directly to high values of $\lambda_{FT}$,
yield solutions that saturate late. The reason that the inconsistency is not readily
apparent in these cases is that the invalid solution methodology is correct for the
theoretical case where the saturation time is identically equal to 1, and the disparity
grows as the difference $1 - t_c$ grows, where $t_c$ is the saturation time. In cases where
the solution is found for parameter combinations of $c$ and $\lambda_{FT}$ that produce an early
saturation time the disparity is obvious: the resulting optimal control schedule is
clearly outside the bounds of a valid probability.

With careful consideration of the insight presented above, substituting the
solution for $\lambda_x$ from equation (3.30) back into the state differential equation (3.23), we
see that the optimal state trajectory is given by

$$\dot{x} = \frac{1}{(c - 1) \left(e^{\frac{\lambda_{FT}}{2} (x(1) - x)} - 1\right)}, \quad x(0) = 0, \quad 0 \le t \le 1 \tag{3.31}$$
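
The reduction can be sanity-checked numerically before it is used. The sketch below is a
minimal illustration that assumes the closed-form costate
$\lambda_x = -c\,(e^{-\lambda_{FT} x / 2} - e^{-\lambda_{FT} x(1) / 2})^2$ and the reduced state
equation $\dot{x} = 1 / [(c - 1)(e^{\lambda_{FT} (x(1) - x) / 2} - 1)]$; the function names and
sample values are hypothetical. It compares a finite-difference slope of the closed-form
costate with the slope $d\lambda_x / dx$ implied by equation (3.24), and evaluates the reduced
state rate as $x$ approaches the guessed final value $x(1)$.

```python
import math


def costate_closed_form(x, x_final, lam_ft, c):
    """Costate as a function of the state x and the guessed final state x(1)."""
    a = math.exp(-0.5 * lam_ft * x)
    a_final = math.exp(-0.5 * lam_ft * x_final)
    return -c * (a - a_final) ** 2


def costate_slope_from_ode(x, lam_x, lam_ft, c):
    """d(lam_x)/dx implied by Eq. (3.24), i.e. lam_x_dot divided by x_dot."""
    return lam_ft * math.sqrt(-lam_x * c) * math.exp(-0.5 * lam_ft * x)


def reduced_state_rate(x, x_final, lam_ft, c):
    """Right-hand side of the single reduced state equation, valid for x < x(1)."""
    return 1.0 / ((c - 1.0) * math.expm1(0.5 * lam_ft * (x_final - x)))


# Hypothetical values: lam_FT = 10, c = 100, guessed final state x(1) = 0.05.
lam_ft, c, x_final = 10.0, 100.0, 0.05
x, dx = 0.02, 1e-6
finite_diff = (costate_closed_form(x + dx, x_final, lam_ft, c)
               - costate_closed_form(x - dx, x_final, lam_ft, c)) / (2.0 * dx)
from_ode = costate_slope_from_ode(x, costate_closed_form(x, x_final, lam_ft, c), lam_ft, c)
print(finite_diff, from_ode)
for x_near in (0.0, 0.04, 0.0499):
    print(x_near, reduced_state_rate(x_near, x_final, lam_ft, c))
```

The state rate grows without bound as $x$ approaches $x(1)$, which is the numerical signature
of the irregularity at the final time.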


The form of the optimal state trajectory from equation (3.31) can be used with
the single shooting method to make an initial guess of $x(1)$, propagate the dynamic
equation, adjust the guess and finally converge on the optimal state trajectory by
eventually matching the $x(1)$ guess to the final, propagated state value. However,
before doing this, consider that as $t \to 1$, $x(1) - x \to 0$. Therefore, it can be seen
from equation 3.31 that

$$\lim_{t \to 1} \dot{x} = \infty$$

This curious result may imply an irregularity at the final time. Recalling the boundary
condition for the costate, $\lambda_x(1) = 0$, it can be seen from equation 3.21 that

$$u^*(1) = \frac{c}{c - 1} > 1$$

when $P_{FTA}$ is not bounded, i.e. in the unconstrained case. Furthermore, analyzing
the time derivative of the optimal control reveals that $\dot{u}^* > 0$, which means the optimal
control is monotonically increasing. This is a usual trait of optimal control problems,
but in this case the control, $u = P_{TR}$, is a probability, so at the critical time $t_c$ it
saturates at 1, its maximum value, and maintains that value until the final time.

Figure 3.1: General trend of optimal control and annotation of critical saturation
time

The critical time is what is referred to as the "saturation time" above. In related
literature [9] this endgame behavior of the optimal solution is termed "going for
broke." Intuitively, it makes sense: if the munition has not yet correctly identified
the true target, and so far it has managed to avoid attacking any false targets and
thus destroying itself, the munition will lower its sensor threshold (increase $P_{TR}$) to
try to identify anything at all in the final moments of the engagement. After all,
an unused munition is a wasted munition. The saturation time $t_c$ depends on the
expected density of false targets in the battlespace (set by the value of $\lambda_{FT}$). This
concept will be further developed later.

The general behavior of the optimal control described in the previous paragraph
is illustrated in Figure 3.1. In order to use the shooting method to calculate the
optimal control and/or state trajectory, it is necessary to find the critical time, $t_c$,
when the optimal control saturates, that is, when $P_{TR}$ assumes the value 1. The optimal
state trajectory will then be propagated in two parts, the first part for the time
interval $0 \le t \le t_c$ according to the state trajectory determined by equation (3.14)
with $u = u^*$, and the second part for the time interval $t_c \le t \le 1$, also determined by
equation (3.14) but with $u^* = 1$.

By definition

$$u^*(t_c) = 1 \tag{3.32}$$

Substituting this value into the equation for the state dynamics, equation (3.14), gives

$$\dot{x}(t) = 1, \quad t_c \le t \le 1 \tag{3.33}$$

Integrating equation (3.33) and applying the final condition yields the following
endgame optimal state trajectory (where endgame denotes the period during the
battlespace sweep when the munition's sensor threshold is at its lowest setting, i.e. saturated,
$u^* = 1$)

$$x(t) = t + x(1) - 1, \quad t_c \le t \le 1 \tag{3.34}$$

The solution for $t_c$ is found from the solution to equation (3.34) at time $t_c$
together with the costate solution at the same time, $\lambda_x(t_c)$. The optimal
endgame state trajectory, equation (3.34), is substituted into the costate differential
equation, equation (3.18), along with $u^* = 1$, resulting in

$$\dot{\lambda}_x = \lambda_{FT}\, e^{-\lambda_{FT} (t + x(1) - 1)}, \quad \lambda_x(1) = 0, \quad t_c \le t \le 1 \tag{3.35}$$

Integrating equation 3.35 and applying its boundary condition gives

$$\lambda_x(t) = \left(1 - e^{\lambda_{FT} (1 - t)}\right) e^{-\lambda_{FT} x(1)} \tag{3.36}$$

Making the appropriate substitutions for $x(t_c)$ from equation 3.34, $\lambda_x(t_c)$ from
equation 3.36, and $u^*(t_c) = 1$ into the formula for the optimal control from equation 3.21
and solving for $t_c$ yields

$$t_c = 1 - \frac{1}{\lambda_{FT}} \ln\left(\frac{c}{c - 1}\right) \tag{3.37}$$
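
The algebra behind equation (3.37) can be spelled out in a few lines. This is a sketch that
assumes the saturated-arc expressions $x(t_c) = t_c + x(1) - 1$ and
$\lambda_x(t) = (1 - e^{\lambda_{FT}(1 - t)})\, e^{-\lambda_{FT} x(1)}$, and uses the
optimality condition (3.20) evaluated at $u^* = 1$, where $c - (c - 1) u = 1$; it is
equivalent to substituting into equation (3.21).

```latex
\begin{align*}
0 &= e^{-\lambda_{FT} x(t_c)} + c\,\lambda_x(t_c)
   && \text{(3.20) with } u^* = 1 \\
0 &= e^{-\lambda_{FT} x(1)}\, e^{\lambda_{FT}(1 - t_c)}
     + c\, e^{-\lambda_{FT} x(1)} \left(1 - e^{\lambda_{FT}(1 - t_c)}\right)
   && \text{substituting } x(t_c) \text{ and } \lambda_x(t_c) \\
e^{\lambda_{FT}(1 - t_c)} &= \frac{c}{c - 1}
   && \text{dividing by } e^{-\lambda_{FT} x(1)} \\
t_c &= 1 - \frac{1}{\lambda_{FT}} \ln\!\left(\frac{c}{c - 1}\right)
\end{align*}
```

Note that the unknown final state $x(1)$ cancels, so $t_c$ depends only on the problem
parameters $c$ and $\lambda_{FT}$.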

It is important to note several insights from the solution for $t_c$. First, $t_c$ is
bounded in the search interval between 0 and $T$, which in this normalized
case is $0 \le t_c \le 1$. The upper bound, $t_c \le 1$, can be reduced to yield $c \le \infty$. This
makes sense by considering the ROC (equation 2.9). A value of $c = \infty$ would mean
that the acquisition sensor was absolutely perfect, meaning that it was capable of never
making a false-positive error while at the same time being able to discriminate true
targets. This contradicts the ROC concept as, from before, the true target declaration
($P_{TR}$) and the false-positive fraction ($1 - P_{FTR}$) are equal at the points (0, 0) and (1, 1).

Figure 3.2: Feasible parameter domain for $c$ and $\lambda_{FT}$

The lower bound of $t_c$ is more useful. The bound $t_c \ge 0$ reduces to the following
direct correspondence between $c$ and $\lambda_{FT}$

$$\lambda_{FT} \ge \ln\left(\frac{c}{c - 1}\right) \tag{3.38}$$

This curve is plotted in Figure 3.2. The relationship between $c$ and $\lambda_{FT}$ indicates that
one may not arbitrarily choose corresponding values. As $\lambda_{FT}$ decreases to very small
values (indicating that the munition expects to see a sparser density of false targets),
the munition must have reasonably good sensor characteristics to expect to see
anything at all. Likewise, if the munition is equipped with an extremely poor sensor
(low value for $c$), it makes little sense to release this munition in search of a target
interspersed among a low density of false targets because there is a high probability
that the munition will be wasted, unable to find the target by the time the battlespace
search is over. Indeed, it is an even worse decision to release a munition with a poor
sensor to search an area with a high density of false targets, as it will be difficult to
mitigate the probability of attacking one while maintaining a reasonable probability
of attacking the desired true target. This undesirable outcome arises because, as
the quality of the sensor $c$ decreases, $t_c$ also decreases, indicating an earlier go-for-broke
time, which is the last action a munition with poor sensor characteristics should
consider. The other important insight regarding the parameters' impact
on the optimal control saturation time $t_c$ is that lowering the false target density $\lambda_{FT}$
will advance the saturation time while increasing the expected false target density
will delay it. In other words, if the munition expects a lower density of false targets
it can afford to go for broke sooner without an undue risk of encountering any false
targets during the remainder of the mission. In summary, $t_c$ increases monotonically
with both $c$ and $\lambda_{FT}$.
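
The saturation time and the feasibility boundary are simple to evaluate. The sketch below is a
minimal illustration using the closed-form saturation time
$t_c = 1 - \ln(c / (c - 1)) / \lambda_{FT}$ of equation (3.37); the boundary function solves
$t_c = 0$ for $c$, which gives $c = 1 / (1 - e^{-\lambda_{FT}})$. The function names and sample
values are hypothetical.

```python
import math


def saturation_time(lam_ft, c):
    """Normalized go-for-broke time t_c of Eq. (3.37); requires c > 1, lam_ft > 0."""
    if c <= 1.0 or lam_ft <= 0.0:
        raise ValueError("requires c > 1 and lam_ft > 0")
    return 1.0 - math.log(c / (c - 1.0)) / lam_ft


def min_sensor_quality(lam_ft):
    """Smallest c for which t_c >= 0, obtained by solving t_c = 0 for c."""
    return 1.0 / (1.0 - math.exp(-lam_ft))


# Hypothetical parameter sweep: t_c grows with both lam_FT and c.
for lam_ft in (1.0, 5.0, 10.0):
    for c in (2.0, 10.0, 100.0):
        print(lam_ft, c, saturation_time(lam_ft, c), c >= min_sensor_quality(lam_ft))
```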

As previously noted, the solution method outlined in equation (3.23) through
equation (3.31) is erroneous. The best solution method is to solve the two point
boundary value problem summarized below. The two differential equations are
the original forms of the state and costate differential equations. The optimal
control is given in equation (3.21). If using the shooting method there are two
possibilities: shooting forward and shooting backward. If shooting forward, make an initial
guess for the costate. Propagate the state and costate incrementally, calculating the
optimal control at each time step, which is used to calculate the next increment of the
state and costate. At the final time compare the value of the costate to the known
boundary condition, $\lambda_x(1) = 0$. With the previously gained insights on the state
and costate dynamics, lower the initial costate guess if the final costate results in a
positive value. Alternatively, one may solve the same problem with a reverse
shooting method. To implement the reverse shooting method, apply the costate boundary
condition, $\lambda_x(1) = 0$, guess the final state $x(1)$, and propagate the state and costate
backwards until time $t = 0$. Iterate until the initial condition on the state, $x(0) = 0$, is met.

The shooting method propagation must be accomplished in two parts, for
$0 \le t \le t_c$ and for the remainder, $t_c \le t \le 1$. Alternatively, the state and costate
may be propagated until the optimal control, a function of the state and costate value
at each increment, saturates; then $u^* = 1$ until the final time. Using this method, $t_c$
is not predetermined, but the mode changes based solely on enforcing the constraint
$u_{max} = 1$ on the control. Solving the problem with or without $t_c$ predetermined results
in the same solution.
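
The forward shooting procedure without a predetermined saturation time can be sketched as
follows. This is a minimal illustration in Python rather than the MATLAB implementation
referred to in this thesis; the forward Euler integrator, the bisection on the initial
costate, the bracket $[-c, -1/c]$ (the range of $\lambda_x(0)$ for which the stationary control
at $t = 0$ lies in $[0, 1]$), the step count, and the sample parameters $\lambda_{FT} = 2$,
$c = 2$ are all assumptions made for the example.

```python
import math


def u_star(x, lam_x, lam_ft, c):
    """Stationary control of Eq. (3.21), clipped to [0, 1] to enforce saturation."""
    if lam_x >= 0.0:
        return 1.0
    raw = (c - math.sqrt(-lam_x * c) * math.exp(0.5 * lam_ft * x)) / (c - 1.0)
    return min(1.0, max(0.0, raw))


def propagate(lam_x0, lam_ft, c, n=5000):
    """Forward Euler propagation of the state and costate from t = 0 to t = 1."""
    h = 1.0 / n
    x, lam_x = 0.0, lam_x0
    controls = []
    for _ in range(n):
        u = u_star(x, lam_x, lam_ft, c)
        controls.append(u)
        x_dot = u / (c - (c - 1.0) * u)                # state dynamics (3.14)
        lam_dot = lam_ft * u * math.exp(-lam_ft * x)   # costate dynamics (3.18)
        x, lam_x = x + h * x_dot, lam_x + h * lam_dot
    return x, lam_x, controls


def shoot(lam_ft, c, n=5000, iterations=60):
    """Bisect on the initial costate until the final costate is (nearly) zero."""
    lo, hi = -c, -1.0 / c      # lo gives u(0) = 0, hi gives u(0) = 1
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        _, lam_end, _ = propagate(mid, lam_ft, c, n)
        if lam_end > 0.0:
            hi = mid           # final costate positive: lower the guess
        else:
            lo = mid
    guess = 0.5 * (lo + hi)
    x_end, lam_end, controls = propagate(guess, lam_ft, c, n)
    return guess, x_end, lam_end, controls


lam_ft, c, n = 2.0, 2.0, 5000
guess, x_end, lam_end, controls = shoot(lam_ft, c, n)
# Time of the first saturated step, to be compared with Eq. (3.37).
t_c_numeric = next(i for i, u in enumerate(controls) if u >= 1.0) / n
print(guess, lam_end, t_c_numeric)
```

For these sample parameters equation (3.37) gives $t_c = 1 - \ln(2)/2 \approx 0.653$, so the
time of the first saturated step should fall close to that value, up to the discretization
error of the integrator.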

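The following is a summary of all the final equations for the unconstrained,
continuous-time, optimal control history for $P_{TR}^*(t)$.

$$\dot{x} = \begin{cases} \dfrac{u^*}{c - (c - 1) u^*}, & x(0) = 0, & 0 \le t \le t_c \\ 1, & x(t_c^+) = x(t_c^-), & t_c \le t \le 1 \end{cases} \tag{3.39}$$

$$\dot{\lambda}_x = \begin{cases} \lambda_{FT}\, u^*\, e^{-\lambda_{FT} x}, & \lambda_x(0) \text{ free}, & 0 \le t \le t_c \\ \lambda_{FT}\, e^{-\lambda_{FT} x}, & \lambda_x(1) = 0, & t_c \le t \le 1 \end{cases} \tag{3.40}$$

$$u^*(x, \lambda_x) = \begin{cases} \dfrac{c - \sqrt{-\lambda_x c}\; e^{\frac{\lambda_{FT}}{2} x}}{c - 1}, & 0 \le t \le t_c \\ 1, & t_c \le t \le 1 \end{cases} \tag{3.41}$$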

3.4.1.2 Constrained Case. Having obtained the unconstrained solution,
it naturally follows to seek the constrained solution, which will deliver the
optimal control schedule to maximize the same objective as before while at the same
time mitigating (i.e. constraining) the probability of attacking a false target. With
this in mind, the problem statement changes to the following

$$\max_u P_{TA} \quad \text{such that} \quad P_{FTA} \le P_{FTA_{max}}$$

Recall the pdf $g(t)$ from equation (3.6)

$$g(t) = \left[1 - \int_0^t u\, d\tau\right] e^{-\lambda_{FT} \int_0^t \frac{u}{c - (c - 1) u}\, d\tau}\; \lambda_{FT} \frac{u}{c - (c - 1) u} \tag{3.42}$$

Note that in equation (3.42) the following substitutions have been made: the control
$u$ is defined as

$$u \triangleq P_{TR},$$

the term $1 - P_{FTR}$ has been replaced with the ROC curve relationship from
equation (2.9), and the time-dependent terms have been expressed in their integral forms.
As before, the integral form requires the introduction of the state dynamics

$$\dot{x} = \frac{u}{c - (c - 1) u}, \quad x(0) = 0, \quad 0 \le t \le 1 \tag{3.43}$$

$$\dot{y} = u, \quad y(0) = 0, \quad 0 \le t \le 1 \tag{3.44}$$

Recalling equation (3.11) and substituting the state definitions for the integral terms
in $g(t)$ (see equation 3.15) yields the constraint

$$P_{FTA} = \int_0^1 (1 - y)\, e^{-\lambda_{FT} x}\, \lambda_{FT} \frac{u}{c - (c - 1) u}\, dt \tag{3.45}$$

where the battlespace sweep time $T$ has been non-dimensionalized by setting it equal
to 1. The objective function for the constrained problem is modified by adding the
equality constraint imposed by the probability of false target attack, $P_{FTA}$, with a
Lagrange multiplier, $\lambda$

$$\max_u J = \int_0^1 \left[u\, e^{-\lambda_{FT} x} + \lambda\, (1 - y)\, e^{-\lambda_{FT} x}\, \lambda_{FT} \frac{u}{c - (c - 1) u}\right] dt \tag{3.46}$$

Note that in this formulation the constraint is appended as an equality constraint.
This means that the solution, $u^*(t)$, will only be optimal insofar as it is not more
beneficial, in terms of the probability of target attack, to use the unconstrained solution
rather than the constrained solution that forces the probability of false target attack to
the value specified as the required $P_{FTA_{max}}$. The resulting $P_{TA}$ for a mission is not
unique in $P_{FTA}$ except at the optimum, meaning that a munition can achieve the same
$P_{TA}$ with two distinctly different outcomes for the penalty, $P_{FTA}$. Clearly, the
solution that results in a lower $P_{FTA}$ is desirable. Choosing the problem formulation
with $P_{FTA}$ as an equality constraint as in equation (3.46) will result in a solution that
forces the resulting $P_{FTA}$ to the specified value. As has been previously shown with
the ROC, there is an advantage in raising the allowable $P_{FTA}$ up to a certain point, since
raising the value of the constraint permits a better outcome for the objective functional
as well. However, beyond some point this is no longer optimal and the best solution that can
be obtained is the unconstrained solution. This approach, setting the constraint as
an equality, is also related to the penalty approach. The final constraint will be set
by tuning the value of the Lagrange multiplier, $\lambda$, $\lambda \le 0$, until the resulting value for
$P_{FTA}$ matches the maximum allowed for the mission. If the maximum is greater than the
value of $P_{FTA}$ produced by the optimal unconstrained solution, then the latter will
be used and the constraint will be inactive. Note that the penalty approach,
namely posing the $P_{FTA_{max}}$ constraint as an equality constraint, was chosen in lieu
of posing the same constraint as an inequality. The complexity avoided by not adding a
slack variable for an inequality constraint was probably preserved in the form
of additional work to ensure that, for a given parameter combination, the optimality of
the solution was maintained. The conditions to ensure optimality are presented later
in this section.

The Hamiltonian for the constrained case is formed by appending the two constraints
imposed by the dynamics equations, with their associated costates, to the modified
objective function given in equation (3.46)

$$H = u\,e^{-\lambda_{FT}x} + \lambda\lambda_{FT}(1-y)\,\frac{u}{c-(c-1)u}\,e^{-\lambda_{FT}x} + \lambda_x\,\frac{u}{c-(c-1)u} + \lambda_y u \tag{3.47}$$
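Applying Pontryagin's Maximum Principle, the optimal control is found by solving
for $u^*$ in the following

$$\frac{\partial H}{\partial u} = e^{-\lambda_{FT}x} + \left[\lambda\lambda_{FT}(1-y)e^{-\lambda_{FT}x} + \lambda_x\right]\frac{c}{[c-(c-1)u]^2} + \lambda_y = 0 \tag{3.48}$$

Likewise, the costate differential equations are found by taking the derivative of the
Hamiltonian with respect to the states

$$\dot{\lambda}_x = -\frac{\partial H}{\partial x} = \lambda_{FT}e^{-\lambda_{FT}x}\left[u + \lambda\lambda_{FT}(1-y)\,\frac{u}{c-(c-1)u}\right], \quad \lambda_x(1) = 0 \tag{3.49}$$

$$\dot{\lambda}_y = -\frac{\partial H}{\partial y} = \lambda\lambda_{FT}\,\frac{u}{c-(c-1)u}\,e^{-\lambda_{FT}x}, \quad \lambda_y(1) = 0 \tag{3.50}$$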


In the interest of verifying the optimality of the solution, the second partial
derivative of the Hamiltonian with respect to the control is given by

$$\frac{\partial^2 H}{\partial u^2} = \frac{2c(c-1)}{[c-(c-1)u]^3}\left[\lambda\lambda_{FT}(1-y)e^{-\lambda_{FT}x} + \lambda_x\right] \tag{3.51}$$

To ensure that the optimal solution is indeed a maximum, the sufficient condition is
checked

$$\frac{\partial^2 H}{\partial u^2} < 0 \tag{3.52}$$

The sufficient condition is determined by examining the various terms in equation (3.51).
The necessary condition for the constraint to be met, according to the method of
Lagrange multipliers, is that $\lambda \le 0$. In addition, it can be determined from the
initial condition $y(0) = 0$ and the bounds on the control, and hence on $\dot{y}$,
$0 \le \dot{y} = u \le 1$, that $0 \le y \le 1$. The remaining variable is $\lambda_x$,
which, by removing the (always positive) leading term in equation (3.51) and
rearranging, can be seen to meet the sufficient condition in equation (3.52) when

$$\lambda_x < -\lambda(1-y)\lambda_{FT}e^{-\lambda_{FT}x} \tag{3.53}$$
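The sign argument behind this rearrangement can be written out as a short worked step. It assumes $c > 1$ (a sensor that performs better than chance, as with the $c = 100$ used later) and an admissible control $0 \le u \le 1$. The bracketed factor is written as it follows from differentiating the Hamiltonian twice with respect to $u$, so it should be read as a sketch under those assumptions:

```latex
1 \le c-(c-1)u \le c
\quad\Longrightarrow\quad
\frac{2c(c-1)}{[c-(c-1)u]^{3}} > 0,
\qquad\text{so}\qquad
\operatorname{sign}\!\left(\frac{\partial^{2}H}{\partial u^{2}}\right)
= \operatorname{sign}\!\left(\lambda\lambda_{FT}(1-y)e^{-\lambda_{FT}x} + \lambda_{x}\right).
```

The second partial is therefore negative exactly when the bracketed factor is negative, which is the inequality on $\lambda_x$ stated above.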
Unlike the unconstrained solution presented in Section 3.4.1.1, where it was shown that
this condition was always met, the constrained solution may or may not be optimal
depending on the values of the states and costates. The real insight is obtained by
looking at equation (3.53) again in a slightly rearranged form

$$\lambda < -\frac{\lambda_x}{\lambda_{FT}(1-y)e^{-\lambda_{FT}x}} \tag{3.54}$$


In tuning the value of the Lagrange multiplier to achieve the desired $P_{FTA_{max}}$,
reducing the value of $\lambda$ (making it more negative) tightens the constraint,
forcing $P_{FTA_{max}}$ to a lower allowable value. Increasing $\lambda$, i.e. making it
less negative, relaxes the constraint, allowing a higher $P_{FTA_{max}}$. In essence,
tuning the value of $\lambda$ varies the penalty imposed by $P_{FTA}$ in the modified cost
function (equation (3.46)). As previously discussed, since $P_{FTA_{max}}$ is set up as an
equality constraint in this problem, increasing $P_{FTA_{max}}$ beyond a certain point
invalidates the optimality of the solution, as the corresponding mission $P_{TA}$ peaks
and then begins to decrease with increasing $P_{FTA}$. At this point the second partial
in equation (3.51) switches sign, invalidating the condition in equation (3.52).
Equation (3.54) provides a bound on $\lambda$ identifying the valid range of values that
ensures an optimal solution while meeting the $P_{FTA}$ constraint. When acquiring a
solution, $\lambda$ may be tuned to any value to adjust the desired $P_{FTA_{max}}$
constraint as long as it meets the condition in equation (3.54).

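From equation (3.48) the optimal control is

$$u^* = \frac{1}{c-1}\left[c - \sqrt{-c\,\frac{\lambda(1-y)\lambda_{FT} + \lambda_x e^{\lambda_{FT}x}}{1 + \lambda_y e^{\lambda_{FT}x}}}\,\right] \tag{3.55}$$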

The optimal control, $u^*$, is a function of four variables: $x$, $y$, $\lambda_x$, and
$\lambda_y$. The problem is shaping up to be a complex TPBV problem. The problem can be
simplified somewhat by removing the dependence on at least one of the variables,
$\lambda_y$. This reduction is only possible for the constrained case, for combinations
of parameters (i.e. $c$, $\lambda_{FT}$, $P_{FTA_{max}}$) that do not force the control to
saturate ($u^* = 1$) before the end of the mission. For the cases where saturation does
occur, which more closely resemble the unconstrained solution, the set of differential
equations representing the states and costates must be integrated in a bimodal fashion.
The reason is the existence of the critical time $t_c$, which is always present in the
unconstrained solution. Reducing the variables assumes a single set of differential
equations valid for $0 \le t \le T$. Reducing the variable dimension in the following way
ignores the existence of $t_c$, which is why this solution step must not be used for the
unconstrained solution, or for parameter combinations in the constrained solution such
that the control saturates before the end of the mission. The method for finding the
critical time, $t_c$, is addressed later.
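As a minimal sketch of how the optimal control expression might be evaluated numerically, the Python function below computes $u^*$ from the four variables and clamps it to the admissible range $[0, 1]$. The function name, argument names, and the choice to raise an error when the radicand is negative are assumptions made for illustration; the formula itself is the one read from equation (3.55).

```python
import math


def optimal_control(x, y, lam_x, lam_y, lam, c, lam_ft):
    """Evaluate u* as in equation (3.55), clamped to the admissible range [0, 1].

    lam is the Lagrange multiplier (lam <= 0); lam_x and lam_y are the costates.
    """
    growth = math.exp(lam_ft * x)
    numerator = lam * (1.0 - y) * lam_ft + lam_x * growth
    denominator = 1.0 + lam_y * growth
    radicand = -c * numerator / denominator
    if radicand < 0.0:
        # Equation (3.55) has no real solution at this point of the trajectory.
        raise ValueError("negative radicand: no real stationary control")
    u = (c - math.sqrt(radicand)) / (c - 1.0)
    # Enforce the bounds on a valid control, 0 <= u <= 1.
    return min(1.0, max(0.0, u))
```

Clamping is what produces the saturated mode $u^* = 1$ discussed in the text: once the unclamped value reaches 1, the control is simply held there.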

For valid parameter combinations seeking the constrained solution, the following
method of reducing the dimension of the problem makes the resulting TPBV problem
more tractable in the event that solving the TPBV problem is the solution method
of choice, or alternate methods are unavailable. The variable to eliminate is
$\lambda_y$. Observe that the state differential equation for $x$, equation (3.43), can
be substituted into the costate differential equation for $\lambda_y$, equation (3.50).
The resulting equation is

$$\dot{\lambda}_y = \lambda_{FT}\,\lambda\,\dot{x}\,e^{-\lambda_{FT}x} \tag{3.56}$$

Recognizing that

$$\frac{d}{dt}\left(e^{-\lambda_{FT}x}\right) = -\lambda_{FT}\,\dot{x}\,e^{-\lambda_{FT}x} \tag{3.57}$$

and integrating both sides yields

$$\lambda_y = -\lambda e^{-\lambda_{FT}x} + \text{Const}, \quad \lambda_y(1) = 0 \tag{3.58}$$

After applying the boundary condition and solving for the integration constant, the
resulting solution is in terms of $x$ and $x(1)$, a single state trajectory

$$\lambda_y = \lambda\left[e^{-\lambda_{FT}x(1)} - e^{-\lambda_{FT}x}\right] \tag{3.59}$$
The solution for $\lambda_y$ in equation (3.59) can be substituted into the previous
equations for $\dot{x}$, $\dot{\lambda}_x$, and $\dot{y} = u^*$, equations (3.43), (3.49),
and (3.55) respectively. With these substitutions the dimension of the TPBV problem is
reduced, but the complexity in terms of satisfying the boundary conditions imposed by
the equations has been preserved. Eliminating the dependence on $\lambda_y$ transfers
the requirement to converge to $\lambda_y(1) = 0$ into a consistency requirement on
$x(1)$: the final value of the propagated $x$ state trajectory must equal the initial
guess for $x(1)$, which is one of the independent variables in equation (3.59).
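The closed-form costate can be checked numerically. The sketch below, in Python, integrates the $x$ state forward for an arbitrary control, integrates the $\lambda_y$ costate equation backward from $\lambda_y(1) = 0$, and compares $\lambda_y(0)$ with the closed form $\lambda[e^{-\lambda_{FT}x(1)} - e^{-\lambda_{FT}x(0)}]$. The linear ramp used as the control, the step count, and the value of $\lambda$ are hypothetical choices for illustration only.

```python
import math


def check_lambda_y(lam=-0.5, c=100.0, lam_ft=25.0, n=5000):
    """Compare a numerically integrated lambda_y(0) with the closed form (3.59)."""
    dt = 1.0 / n
    # Hypothetical control schedule: a ramp from about 0.2 to about 0.8.
    schedule = [0.2 + 0.6 * (i + 0.5) * dt for i in range(n)]

    # Forward pass: x' = u / (c - (c - 1) u), x(0) = 0.
    xs = [0.0]
    for u in schedule:
        xs.append(xs[-1] + dt * u / (c - (c - 1.0) * u))

    # Backward pass: lambda_y' = lam * lam_ft * x' * exp(-lam_ft * x), lambda_y(1) = 0.
    lam_y = 0.0
    for i in reversed(range(n)):
        u = schedule[i]
        rate = u / (c - (c - 1.0) * u)
        x_mid = 0.5 * (xs[i] + xs[i + 1])
        lam_y -= dt * lam * lam_ft * rate * math.exp(-lam_ft * x_mid)

    closed_form = lam * (math.exp(-lam_ft * xs[-1]) - math.exp(-lam_ft * xs[0]))
    return lam_y, closed_form
```

Agreement between the two values for an arbitrary schedule reflects the fact that the closed form depends only on the $x$ trajectory, not on the particular control that generated it.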

One can find the solution of the TPBV problem in $x$, $y$, and $\lambda_x$ by first
choosing a value for $\lambda$. This value corresponds directly to the $P_{FTA_{max}}$
constraint and can be adjusted later by tuning $\lambda$. As the state and costate
differential equations are propagated, it is important to continually check the
validity of the value for $\lambda$ by making sure that it meets the condition specified
in equation (3.54). The initial conditions for $x$ and $y$ are given in equations (3.43)
and (3.44), respectively. Choose an initial guess for $\lambda_x(0)$ and $x(1)$ and
propagate, or integrate, the differential equations, a procedure also called “shooting”.
Converging on the final solution that satisfies the boundary conditions (and for which
optimality is guaranteed) requires iterating the above steps until the conditions have
been met. As with the unconstrained solution, the nature of the state and costate
equations affords some insight as to how to adjust the initial guess to come closer to
the solution with each iteration. Given that $\dot{x}$ is never negative, $x$ is
monotonically increasing from $x(0) = 0$. The correct value for $x(1)$ lies somewhere
between the guess and the actual, final propagated value of the $x$ state trajectory.
The variable $y$ is likewise non-negative and monotonically increasing from the initial
condition $y(0) = 0$, and, as previously noted, the sign of the $\lambda_x$ costate
trajectory varies depending on the optimality of the solution. Care must be taken to
pursue the solution with insight into the convergence of the solution and the solution
itself. For instance, it would be wise to solve the unconstrained problem first. If the
$P_{FTA}$ that results from an attempt to obtain the constrained solution is higher than
that resulting from the unconstrained case, it means that either the process has not
converged, one should choose a different value for $\lambda$ (i.e., a different
$P_{FTA_{max}}$ constraint), or the unconstrained solution yields the best performance
that can be achieved with the selected values for $c$ and $\lambda_{FT}$.
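A single pass of the shooting procedure described above can be sketched as follows. This Python function is an illustrative assumption, not the thesis implementation: it uses forward Euler integration, eliminates $\lambda_y$ through the closed form in terms of the guessed $x(1)$, and returns the two residuals that an outer iteration (not shown) would drive to zero. When the second-order condition fails, the sketch falls back to whichever endpoint control gives the larger Hamiltonian, which is a choice made here for robustness.

```python
import math


def shoot(lam, lam_x0, x1_guess, c=100.0, lam_ft=25.0, n=2000):
    """Propagate states and lambda_x once from guesses for lambda_x(0) and x(1).

    Returns (lambda_x(1), x(1) - x1_guess, concave), where concave is True only
    if the second-order condition held at every step.
    """
    dt = 1.0 / n
    x, y, lam_x = 0.0, 0.0, lam_x0
    concave = True
    for _ in range(n):
        decay = math.exp(-lam_ft * x)
        lam_y = lam * (math.exp(-lam_ft * x1_guess) - decay)  # closed-form costate
        bracket = lam * lam_ft * (1.0 - y) * decay + lam_x    # sign sets d2H/du2
        slope0 = decay + lam_y                                # u-linear part of dH/du
        if bracket < 0.0:
            if slope0 > 0.0:
                root = math.sqrt(-c * bracket / slope0)
                u = min(1.0, max(0.0, (c - root) / (c - 1.0)))
            else:
                u = 0.0  # dH/du < 0 everywhere on [0, 1]
        else:
            concave = False
            # H(0) = 0 and H(1) = slope0 + bracket; take the better endpoint.
            u = 1.0 if slope0 + bracket > 0.0 else 0.0
        rate = u / (c - (c - 1.0) * u)
        lam_x += dt * lam_ft * decay * (u + lam * lam_ft * (1.0 - y) * rate)
        x += dt * rate
        y += dt * u
    return lam_x, x - x1_guess, concave
```

An outer loop would adjust `lam_x0` and `x1_guess` until both residuals vanish, and would reject any converged pair for which `concave` is False.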

As previously noted, the method outlined above to obtain the constrained solution
should only be used when there does not exist a time $t_c < 1$ at which $u^*(t_c) = 1$.
This is the case for most constrained solutions; when it is not, the time $t_c$ is found
as follows. Recalling the equation for the optimal control, $u^*(t)$ (3.55), and noting
that $\lambda_x(1) = 0$ and $\lambda_y(1) = 0$

$$u^*(1) = \frac{c - \sqrt{c}\,\sqrt{-\lambda[1-y(1)]\lambda_{FT}}}{c-1} \tag{3.60}$$

The question is, for what value of $\lambda$ is $u^*(1) \ge 1$? This question is posed
mathematically as

$$1 \le \frac{c - \sqrt{c}\,\sqrt{-\lambda[1-y(1)]\lambda_{FT}}}{c-1} \tag{3.61}$$

The set of valid values for $\lambda$ is less than or equal to zero, so solving for
$\lambda$, $u^*(1) \ge 1$ when

$$-\frac{1}{c\,\lambda_{FT}[1-y(1)]} \le \lambda \le 0 \tag{3.62}$$
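As a worked illustration of this bound, take the baseline parameters $c = 100$ and $\lambda_{FT} = 25$ and suppose, purely as an assumed value, that the control schedule integrates to $y(1) = 0.5$:

```latex
-\frac{1}{c\,\lambda_{FT}\,[1-y(1)]}
= -\frac{1}{100 \cdot 25 \cdot 0.5}
= -8\times10^{-4}
\quad\Longrightarrow\quad
u^*(1) \ge 1 \ \text{ for } \ -8\times10^{-4} \le \lambda \le 0 .
```

Since $y(1)$ is itself an outcome of the solution, the threshold can only be confirmed after a candidate trajectory has been propagated.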

If the condition in equation (3.62) is met, there exists some time $t_c$ less than 1. As
with the unconstrained solution method, the state and costate solutions may be
propagated with or without the predetermination of $t_c$. If it is determined that $t_c$
exists, the solution may be propagated until the control equals 1, which marks the
critical time, $t_c$. For the remainder of the integration $u^* = 1$. The mode changes
based solely on enforcing the constraint for a valid control, $0 \le u^* \le 1$.
Otherwise the time $t_c$ may be predetermined, in which case the solution is propagated
with $u = u^*$ for $0 \le t < t_c$ and $u^* = 1$ for $t_c \le t \le 1$. Both solution
methods yield identical results.

If $t_c$ exists, it is found in a way similar to the unconstrained case presented in
Section 3.4.1.1. First, knowing that $u^*(t_c) = 1$ and substituting it into the
equations for the state dynamics, equations (3.43) and (3.44), yields the endgame state
trajectories

$$x(t) = t + x(1) - 1, \quad t_c \le t \le 1 \tag{3.63}$$

$$y(t) = t + y(1) - 1, \quad t_c \le t \le 1 \tag{3.64}$$

Substituting the state solutions given in equations (3.63) and (3.64) into the costate
differential equations given in (3.49) and (3.50) and integrating yields solutions for
the costate trajectories in the time interval $t_c \le t \le 1$. Substitute the solutions
for $x(t_c)$, $y(t_c)$, $\lambda_x(t_c)$, and $\lambda_y(t_c)$ into the expression for
$u^*(t_c)$ from equation (3.55) and solve for $t_c$ resulting in


tc=l 


In 


A 


+ 


A 


FT 


Aft  c(l  4“  A)  1  -|-  A 


[1  +  AAft(1  —  2/(1))  +  A] 


(3.65) 


One observation to note is that the resulting solution for $t_c$ and the condition for
the existence of $t_c$ (equation (3.62)) both depend on $y(1)$, which is the integral
of the control. Intuitively, this indicates that as the constraint $P_{FTA_{max}}$ is
relaxed the area under the sensor threshold schedule curve increases. For a given
parameter combination (i.e. $c$ and $\lambda_{FT}$) there is some point at which the area
captured by the optimal control schedule curve grows large enough that the control will
go for broke before the end of the mission. As the area under the curve grows even more,
the go-for-broke time $t_c$ occurs earlier.


3.4.2 Discrete Optimal Control Problem. The two-point boundary value
problem proves very challenging, especially as the complexity and dimensionality of
the optimal control problem increase. For this reason an alternate solution method
is demonstrated that entails a discretized version of the problem and a subsequent
solution by means of a numerical optimization algorithm.
The Mayer formulation of the discrete-time optimal control problem [1] is given by

$$\min_{u(i),\; i=0,\ldots,N-1} \phi[s(N)] = -P_{TA}$$

subject to

$$s(i+1) = f[s(i), u(i)]$$

$$\psi[s(N)] = P_{FTA} \le P_{FTA_{max}}$$

where $s$ represents the state vector $s = [x \;\; y]^T$. In the Mayer form the path cost
(a sum of incremental probabilities) is represented as a single terminal cost. From
Equation (3.16) the discretized objective, $P_{TA}$, becomes

$$P_{TA} = -\phi[s(N)] = \Delta t \sum_{i=1}^{N} u(i-1)\,e^{-\lambda_{FT}x(i-1)} \tag{3.66}$$

In the same way, from Equation (3.45) the discrete problem constraint, $P_{FTA}$, is

$$\psi[s(N)] = \Delta t \sum_{i=1}^{N} \lambda_{FT}\,[1-y(i-1)]\,\frac{u(i-1)}{c-(c-1)u(i-1)}\,e^{-\lambda_{FT}x(i-1)} \tag{3.67}$$

The discretized state equations are given by

$$x(i+1) = x(i) + \Delta t\,\frac{u(i)}{c-(c-1)u(i)} \tag{3.68}$$

$$y(i+1) = y(i) + \Delta t\,u(i) \tag{3.69}$$


Discrete optimal control problems are solved by representing the continuous-time
formulation in terms of a cost at each discrete time step of interest, or an overall path
cost sum, that is a function of the states and the control at each time step. The
control vector becomes the parameter vector to vary in the resulting static parameter
optimization problem. The optimization may be solved by a number of algorithms, but for
the purposes of this investigation MATLAB's ‘fmincon’ gradient-search algorithm proved
robust and fast enough to accurately and efficiently find the optimum control vector,
which agreed with the analytic solution.
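The idea of treating the control vector as the parameter vector can be sketched in a few lines of Python. The first function evaluates the discretized sums and state updates for a given control vector. The second is a deliberately crude coordinate search on the penalized objective $P_{TA} + \lambda P_{FTA}$; it is a stand-in chosen here to keep the example dependency-free and is not the gradient-based `fmincon` algorithm used in the thesis. All function names and default values are assumptions.

```python
import math


def mission_probabilities(controls, c=100.0, lam_ft=25.0):
    """Return (P_TA, P_FTA) for a control vector using left-endpoint sums."""
    dt = 1.0 / len(controls)
    x = y = p_ta = p_fta = 0.0
    for u in controls:
        rate = u / (c - (c - 1.0) * u)   # ROC false-positive fraction
        decay = math.exp(-lam_ft * x)
        p_ta += dt * u * decay
        p_fta += dt * lam_ft * (1.0 - y) * rate * decay
        x += dt * rate                   # discretized x state update
        y += dt * u                      # discretized y state update
    return p_ta, p_fta


def penalized(controls, lam):
    """Penalized objective with Lagrange multiplier lam <= 0."""
    p_ta, p_fta = mission_probabilities(controls)
    return p_ta + lam * p_fta


def coordinate_search(lam, n=10, start=0.5, step=0.25, tol=1e-3):
    """Improve one control element at a time; halve the step when stuck."""
    controls = [start] * n
    best = penalized(controls, lam)
    while step > tol:
        improved = False
        for i in range(n):
            for delta in (step, -step):
                trial = list(controls)
                trial[i] = min(1.0, max(0.0, trial[i] + delta))
                value = penalized(trial, lam)
                if value > best:
                    controls, best, improved = trial, value, True
        if not improved:
            step *= 0.5
    return controls, best
```

Calling `coordinate_search(-0.5)` returns a ten-element schedule and its penalized objective. Sweeping `lam` toward zero relaxes the penalty, in the same way that tuning the Lagrange multiplier adjusts the effective constraint in the continuous problem.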

3.5 Summary

Chapter III presents and solves the problem of determining the optimal $P_{TR}$
setting that maximizes the probability of attacking the true target while avoiding the
false target attack outcome. Chapter IV presents the results of the optimization, namely,
the Weapon Operating Characteristic (WOC) and its interpretation and meaning
with regard to a munition's performance in a battlespace containing false targets.
IV. The Wide-Area Search Munition Operating Characteristic

4.1 Overview

Chapter III presented the optimal control problem and its solution for an optimal
fixed and dynamic $P_{TR}$. The objective is to characterize a munition so as to
obtain its best possible probability of attacking the true target in Scenario 1 given a
constraint on the probability of the undesirable outcome of attacking a false target.
This chapter presents the wide-area search munition operating characteristic (WOC).
First the static WOC resulting from the solution in Section 3.3 is presented, followed
by the dynamic WOC from the solutions presented in Section 3.4 and a comparison
of the static and dynamic results.

4.2 Static Results

Recall equation (3.10), the mission objective $P_{TA}$ for given values of $\lambda_{FT}$
and $c$. The optimum, without concern for the ensuing $P_{FTA}$, which will eventually
become the constraint, may be found by solving for the value of $P_{TR}$ that sets the
derivative of equation (3.10) with respect to the control, $P_{TR}$, to zero.
Alternatively, one can observe the peak of the curve plotted in Figure 4.1, which shows
a peak at $P_{TA}^* = 0.535$, the maximum probability of target attack for
$\lambda_{FT} = 25$ and $c = 100$. This unconstrained optimum is achieved by applying the
fixed, unconstrained optimum sensor threshold corresponding to $P_{TR}^* = 0.723$ for the
duration of the munition's search and attack mission. The peak and subsequent decline in
$P_{TA}$ make sense because increasing the control past the optimum (analogous to
lowering the sensor threshold beyond the optimum level) substantially reduces the
munition's probability of reaching the true target (which it would probably classify
correctly since $P_{TR}$ is set so high) before encountering a false target and
incorrectly classifying it, resulting in an attack on the false target. Recall that the
false positive fraction, $1 - P_{FTR}$, increases with $P_{TR}$ according to the ROC
relationship.
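The location of this peak can be reproduced with a short Python sketch. The closed form used below is an assumption derived here for a constant control: with $u$ fixed, the state grows linearly as $x = rt$ with $r = u/(c-(c-1)u)$, so integrating $u\,e^{-\lambda_{FT}x}$ over the unit mission time gives $u(1-e^{-a})/a$ with $a = \lambda_{FT}r$. It is not a transcription of equation (3.10), and the grid resolution and function names are illustrative.

```python
import math


def static_pta(u, c=100.0, lam_ft=25.0):
    """P_TA for a constant control u over a unit-length mission."""
    if u == 0.0:
        return 0.0
    a = lam_ft * u / (c - (c - 1.0) * u)  # expected false alarms over the sweep
    return u * (1.0 - math.exp(-a)) / a


def static_peak(c=100.0, lam_ft=25.0, points=1000):
    """Grid search for the constant control that maximizes P_TA."""
    grid = [i / points for i in range(points + 1)]
    best_u = max(grid, key=lambda u: static_pta(u, c, lam_ft))
    return best_u, static_pta(best_u, c, lam_ft)
```

For the baseline parameters the search lands near the values quoted in the text, a control of about 0.723 and a peak probability of about 0.535. The curve is quite flat near its maximum, so the located control is sensitive to the grid spacing while the peak value is not.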
Figure 4.1: Mission $P_{TA}$ vs. static $P_{TR}$; $\lambda_{FT} = 25$, $c = 100$

Adding the constraint derived in equation (3.12), Figure 4.2 overlays the probability
of false target attack obtained from plotting equation (3.12) as a function of
$P_{TR}$. In addition, the probability of not attacking anything at all (i.e. the
munition survives the battlespace sweep) is also plotted as a function of $P_{TR}$. This
curve, given by the expression $1 - P_{TA} - P_{FTA}$, is monotonically decreasing, as
expected, just as the probability of false target attack is monotonically increasing.

Figure 4.2 shows the cost that is incurred in terms of the probability of attacking
undesired false targets in the battlespace while attempting to find and attack the true
target. Thus, the maximum $P_{TA}$ can be determined for a given mission constrained
by a maximum allowable probability of false target attack, $P_{FTA}$. For instance, it
can now be seen that the overall, unconstrained, optimal probability of target attack
for the mission (from Figure 4.1), $P_{TA}^* = 0.535$, incurs a cost of
$P_{FTA} = 0.318$. However, suppose that the maximum allowable $P_{FTA}$ is bounded at
0.2; the optimal constrained solution is now a static $P_{TR}^* = 0.563$ with a
resulting $P_{TA}^* = 0.483$.
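The constrained static setting can be found numerically by bisection, as in the Python sketch below. The closed forms assume a constant control, so that $x = rt$ and $y = ut$, and were derived here from the integral definitions rather than copied from equations (3.10) and (3.12). The sketch only covers the case where the constraint is active, that is, a bound below the unconstrained $P_{FTA}$ of about 0.318 for the baseline parameters. Function names are hypothetical.

```python
import math


def static_probabilities(u, c=100.0, lam_ft=25.0):
    """(P_TA, P_FTA) for a constant control u over a unit-length mission."""
    if u == 0.0:
        return 0.0, 0.0
    a = lam_ft * u / (c - (c - 1.0) * u)
    survive = math.exp(-a)  # probability of no false target attack by t = 1
    p_ta = u * (1.0 - survive) / a
    p_fta = (1.0 - survive) - u * (1.0 - survive * (1.0 + a)) / a
    return p_ta, p_fta


def constrained_static_control(pfta_max, c=100.0, lam_ft=25.0, iters=60):
    """Bisect on u so that P_FTA(u) = pfta_max, using that P_FTA rises with u."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if static_probabilities(mid, c, lam_ft)[1] < pfta_max:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With a bound of 0.2 the bisection returns a control close to 0.563 and a target attack probability close to 0.483, in line with the example in the text. Repeating the call over a range of bounds, and capping at the unconstrained optimum, traces out the static operating characteristic.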

Figure 4.2: Mission probability of attack with constant $P_{TR}$ for $\lambda_{FT} = 25$, $c = 100$.
Plots show the probability of attacking a true target, a false target, and no target at all.

With the static optimization complete it is now possible to obtain the overall
WASM Operating Characteristic (WOC). The WOC shows the optimum achievable
$P_{TA}$ for a given bound on $P_{FTA}$. This is analogous to the classical ROC from the
theory of communication. The WOC, however, is specific to the munition of interest
as it quantifies its overall mission effectiveness with respect to the parameters of
interest, namely $P_{TA}$ and $P_{FTA}$. The WOC for a munition's optimal, but fixed,
sensor threshold setting is shown in Figure 4.3. Obtaining the WOC is also the goal of
the dynamic optimization in Section 3.4, in addition to the optimal sensor threshold
control schedule that achieves the best objective/cost tradeoff.

Figure 4.3: Static WASM Operating Characteristic (WOC); $\lambda_{FT} = 25$, $c = 100$

The WOC in Figure 4.3 corroborates and readily shows what can be inferred from
Figures 4.1 and 4.2. First, the WOC is not a monotonically increasing function. The
optimum, unconstrained $P_{TA}$ is clearly seen at the peak of the curve, which matches
the optimum value cited earlier as well as the corresponding value of $P_{FTA}$. In
addition, the WOC clearly shows the point at which the value of the constraint,
$P_{FTA}$, should be capped. This also occurs at the peak of the curve, since mandating
any further increase in the probability of false target attack only hinders the
achievement of the objective, namely maximizing the probability of true target attack
(note that this only applies if the problem is solved posing the $P_{FTA_{max}}$
constraint as an equality, as in this thesis). Indeed, the same probability of true
target attack may be achieved by two separate selections of $P_{TR}$; however, the lower
setting is clearly better since the higher value results in a higher probability of
false target attack. In practical terms this means that as the sensor threshold is
reduced ($P_{TR}$ is increased) there is some point beyond which the unconstrained
solution should be used, since it delivers the highest probability of target attack.


4.3 Dynamic Results

This section continues with the results of the optimal control problem solved in
Section 3.4. The dynamic WOC offers substantial insight into the performance of
wide-area search munitions operating in a battlespace environment containing false
targets.

The key result is the WASM Operating Characteristic, or WOC, which gives
information similar to the classical ROC but specific to the performance of an
autonomous search and attack munition. Comparing the results presented in Figure 4.3 to
those for the optimal, dynamically varying sensor threshold setting shows the
improvement gained by applying the optimal control approach. Figure 4.4 compares the two
for the baseline case where the problem parameters are $c = 100$ and $\lambda_{FT} = 25$.
This is the same parameter combination as in Figure 4.3. There are several things to
note from this example. First, note the obvious improvement in $P_{TA}$ in the dynamic
case, which applies the optimal schedule for the varying sensor threshold. Various
parameter combinations show different levels of improvement, but several things stand
out. Optimally varying the sensor threshold always produces a higher probability of
target attack than maintaining the sensor threshold at an optimal, albeit constant,
level throughout the mission. The improvement is very noticeable as $P_{FTA_{max}}$ is
increased; at lower values of the maximum allowable false target attack probability, the
$P_{TA}$ resulting from optimally varying the sensor threshold is still improved, but
the improvement is too small to notice in the figure. The reason is that as the
constraint is lowered, i.e. a lower $P_{FTA_{max}}$ is imposed, the dynamic optimal
control solution (the $P_{TR}$ schedule) shifts downward and flattens, looking more and
more like the static solution. Indeed, one can infer from the ROC that the trivial case
represented at the origin of the ROC, where the maximum allowable probability of false
target attack is zero, would have identical dynamic and static solutions: flat lines of
$P_{TR} = 0$ for $0 \le t \le T$.

Figure 4.4: Static and Dynamic WOC; $\lambda_{FT} = 25$, $c = 100$.

The other trait present in all static vs. dynamic WOC comparisons is that with an
increase in the objective $P_{TA}$, there is also an increase in $P_{FTA}$ when the
solution is not constrained by $P_{FTA_{max}}$. Presumably this would be acceptable
since the chosen $P_{FTA_{max}}$ is actually the maximum allowable probability. This
concept also relates back to ROC insights. The ROC indicates that the objective and the
penalty unavoidably vary together in a way governed by $c$, the parameter that is set by
the quality of the sensor, ATR algorithm, munition velocity, pixels on target, etc.
Practically, the static solution is only constrained up to a certain point, beyond which
it is no longer beneficial to allow a higher probability of attacking a false target
during the mission since the munition is already doing the most it can to that end. At
that maximum, the unconstrained solution is used. The same is true for the dynamic
solution; however, it can take advantage of a higher $P_{FTA}$ constraint since the
dynamic solution can achieve a higher $P_{TA}$ than the static one. In cases where the
$P_{FTA_{max}}$ constraint is set low enough that both the static and dynamic solutions
are constrained, the resulting $P_{FTA}$ is equal for both solution cases, but the
dynamic solution still yields a higher $P_{TA}$.

Another point to note in Figure 4.4 is that the WOC curves flatten at a certain
point corresponding to the transition to the unconstrained, optimal $P_{TR}$ schedule.
This is intuitive, but it is important to remember because in the following figures the
WOC is presented only as the $P_{TA}$, $P_{FTA_{max}}$ relationship. The $P_{FTA}$,
$P_{FTA_{max}}$ relationship is always one of equality until the breakpoint, after which
$P_{FTA}$ remains constant for the remaining values of $P_{FTA_{max}}$. Thus for any
desired value of $P_{FTA_{max}}$ the resulting $P_{FTA}$ may also be inferred by just
looking at the single $P_{TA}$, $P_{FTA_{max}}$ WOC.

The following figures are surface plots illustrating several WOCs. Each individual
WOC is a slice of the surface in the $P_{FTA_{max}}$-$P_{TA}$ plane. The WOCs, and
therefore the surface, vary with the sensor's parameter $c$. Each surface corresponds to
a different number of assumed false targets, which is specified by $\lambda_{FT}$.

Figure 4.6: Dynamic WOC surface, $\lambda_{FT} = 5$.

Figure 4.7: Dynamic WOC surface, $\lambda_{FT} = 25$.

The first thing to note from Figures 4.5, 4.6, and 4.7 is the WOC trend as $c$
varies. Holding $\lambda_{FT}$ constant, the surface rises with $c$. This means that the
WOC becomes more favorable with higher values of $c$. The munition can achieve higher
probabilities of true target attack without sacrificing as much in terms of the
probability of attacking a false target. This makes sense because higher values of $c$
mean that the munition's sensor is more capable of distinguishing true targets from the
chaff without committing false positives. The result is that the munition implements
higher $P_{TR}$ schedules, or goes for broke earlier, without an undue risk of making a
false positive error on an unintended (false) target. Higher values of $c$ are realized
by making the task easier for the munition's sensor, that is, flying lower or slower and
placing more pixels (or observation time) on each potential target, improving the
automatic target recognition algorithm, or improving the quality of the sensor itself.
Another point to note is that with increasing $c$ the WOC curve gets steeper. This
follows directly from the effect of increasing $c$ on the ROC curve. The munition
performs better without being subject to higher values of $P_{FTA_{max}}$. Another way
to think of this is that with higher $c$ the munition achieves its unconstrained best at
lower values of $P_{FTA_{max}}$. Otherwise, with poor sensor characteristics, the only
way for the munition to improve its objective ($P_{TA}$) is to eat up more $P_{FTA}$,
which is the essence of the ROC (and hence WOC) relationship.

The other important insight garnered from the WOC surfaces is the trend that
occurs with changes in the battlespace environment, namely, $\lambda_{FT}$. As
$\lambda_{FT}$, the expected number of false targets in the battlespace, increases, the
plotted surface lowers. Also, the steepness of the WOC decreases, i.e., the value of
$P_{FTA_{max}}$ at which the munition achieves its unconstrained best increases. In the
interest of making sound operational decisions, one can observe this shift and obtain a
feel for the $c/\lambda_{FT}$ ratio. One might decide that this ratio should be no less
than 4, for example, indicating that the munition's ability to classify true targets has
to be at least as good as a certain level dictated by the ratio relative to the expected
number of false targets in the battlespace. This is a direct way that this theoretical
research affects the policy of conducting autonomous search and attack operations.

The downward shift in the WOC surface with increasing $\lambda_{FT}$ indicates that if
the value chosen for $\lambda_{FT}$ is an over-estimate of the actual number of false targets in the
battlespace, the probability of false target attack will always be less than the specified
$P_{FTA_{max}}$. It will no longer be optimal, but it will most likely be close to optimal,
and more importantly there is insurance that the maximum allowable probability of
attacking an unintended object will be upheld. In other words, in the presence of
uncertainty as to the density of actual false targets the munition will encounter in
the battlespace, it is wiser to overestimate the expected number in order to preserve
$P_{FTA_{max}}$. Mathematically, this is expressed as

$$P_{FTA} < P_{FTA_{max}} \quad \forall \quad \lambda_{FT} < \lambda_{FT_{max}} \qquad (4.1)$$
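
As a rough numerical illustration of (4.1), the sketch below uses a deliberately simplified stand-in for the thesis model: a constant sensor threshold rather than the optimal schedule, a sweep normalized to unit length, one uniformly placed target, and Poisson false targets. The ROC form and the closed-form expression for $P_{FTA}$ in the code are assumptions made for this sketch, as are all function names and numerical values.

```python
import math


def p_false_alarm(p_tr, c):
    """Assumed ROC form: probability that a false target is declared a target."""
    return p_tr / ((1.0 - c) * p_tr + c)


def p_fta(p_tr, c, lam_ft):
    """P_FTA for a constant threshold on a unit-length sweep (assumed model).

    False-target attacks arrive at rate alpha = lam_ft * P_FA, and the target,
    uniform on [0, 1], is attacked with probability p_tr when reached first.
    """
    alpha = lam_ft * p_false_alarm(p_tr, c)
    if alpha == 0.0:
        return 0.0
    survive = math.exp(-alpha)  # probability of no false-target attack at all
    return 1.0 - survive - p_tr * ((1.0 - survive) / alpha - survive)


def threshold_for_bound(c, lam_design, p_fta_max, iters=60):
    """Bisect for a constant P_TR at which P_FTA(lam_design) meets p_fta_max.

    The returned value always satisfies P_FTA <= p_fta_max at lam_design.
    """
    lo, hi = 0.0, 1.0
    if p_fta(hi, c, lam_design) <= p_fta_max:
        return hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if p_fta(mid, c, lam_design) <= p_fta_max:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    c, lam_design, bound = 100.0, 10.0, 0.2  # hypothetical values
    p_tr = threshold_for_bound(c, lam_design, bound)
    for lam_actual in (2.0, 5.0, 8.0, 10.0):
        print(lam_actual, p_fta(p_tr, c, lam_actual) <= bound + 1e-9)
```

In this simplified model the threshold is sized for the over-estimate `lam_design`, and the bound continues to hold for every smaller actual value of $\lambda_{FT}$, which is the behavior (4.1) expresses.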


4.4  Summary 

The WOC is the performance characteristic for a properly characterized munition
(quantified by the value for c) assigned to attack a target in a battlespace
containing false targets. The WOC surfaces include the sensor quality information
as well, allowing the potential for observing the characteristic for a range of c if there
is some uncertainty in the weapon characterization. The following chapter discusses
the application of these results as well as simultaneous efforts in simulation and
experimentation that build on the theory presented in this thesis.


V.  Conclusions


5.1  Overview

Applying  optimal  control  techniques  to  the  autonomous  munition  scenario  is  not 
only  fascinating  but  practically  applicable  as  well.  Two  concurrent  theses  written  at 
AFIT  have  taken  this  theoretical  research  one  step  closer  to  practical  application  with 
a verification of the theoretical results in this thesis using a high-fidelity simulation
as  well  as  experimental  validation.  This  chapter  discusses  the  practical  application 
of  this  theoretical  work  as  well  as  the  work  that  was  completed  simultaneously.  The 
chapter  concludes  with  recommendations  for  future  work. 

5.2  Application  of  Theory 

Chapter I proposed a futuristic scenario where autonomous munitions are trusted
to perform battlespace search and attack operations. In that scenario the munition
may calculate its optimal parameter variations in real time or the optimal schedule
may be predetermined. In any case, the munition would perform an optimal
battlespace sweep. The futuristic scenario is the direct application of the optimal sensor
threshold schedule solution. However, there is a practical application of this theory
that can be realized now. The scenario models are realistic mathematical
representations of battlespace search and attack operating areas. Also, the Poisson distribution
conveniently provides an accurate mathematical model for encountering randomly
distributed false targets. Despite intelligence efforts, battlespace environments remain
highly uncertain and, at times, unpredictable. Thus, the probabilistic approach and
stochastic element introduced by the confusion matrix and ROC concept are often the
best representation of a search and attack battlespace.

Because this theory is a fairly good representation of the real
world, the optimal solutions derived in this thesis provide a baseline for current
operations. Commanders and others who depend on munitions or other vehicles, both
manned and unmanned, to perform search and/or attack type missions have a
probabilistic expectation for the performance of their systems operating in the battlespace.
The WOC is the performance characteristic that can be used to gauge the probability
of mission success and make wiser decisions regarding the employment of expensive
agents to perform the search and attack function. One can readily assess the value
of the objective and the probability that the objective will be successfully completed
against the value of the munition. This usefulness is itself one of the most valuable
outcomes of this research. Indeed, even modern-day manned aircraft have a quantifiable
sensor characterization (quantified by c). Thus, current manned flying operations
could use this theory to obtain the probability of mission success without ever leaving
the ground.

5.3  Concurrent  Work  in  Simulation  and  Experimentation 

Concurrent research at AFIT in the field addressed by this thesis produced
satisfying corroboration in simulation and experimentation. The Air Force Research
Lab (AFRL) maintains a high-fidelity UAV simulation named “Multi-UAV” that was
used by Captain Michael Marlin in related thesis research [10]. Among the results that
Capt Marlin produced were Monte Carlo simulations that duplicated the performance
characteristic solved analytically in this work. The simulation was able to duplicate
a stochastic battlespace defined by scenario 1. A munition of varying sensor ability
(quantified by c) was flown in simulation against various numbers of false targets
(quantified by $\lambda_{FT}$) and the statistical frequency of mission success (attack of the
intended target) closely agreed with the probability determined by the theoretical
algorithm in this thesis.
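
The kind of comparison described above can be mimicked at toy scale. The following is a minimal sketch, not the Multi-UAV simulation: it assumes a constant sensor threshold, a sweep normalized to unit length, a single uniformly placed target, and false targets arriving as a Poisson process with mean $\lambda_{FT}$. The ROC form, the closed-form probabilities, and every name and parameter value are assumptions introduced for illustration.

```python
import math
import random


def roc_false_alarm(p_tr, c):
    """Assumed ROC form: probability that a false target is declared a target."""
    return p_tr / ((1.0 - c) * p_tr + c)


def analytic_probs(p_tr, c, lam_ft):
    """Closed-form (P_TA, P_FTA) for the assumed model; requires p_tr, lam_ft > 0."""
    alpha = lam_ft * roc_false_alarm(p_tr, c)  # rate of false-target attacks
    survive = math.exp(-alpha)
    mean_survival = (1.0 - survive) / alpha    # integral of exp(-alpha*t) on [0, 1]
    p_ta = p_tr * mean_survival
    p_fta = 1.0 - survive - p_tr * (mean_survival - survive)
    return p_ta, p_fta


def simulate(p_tr, c, lam_ft, trials, seed=0):
    """Monte Carlo frequencies of target attack and false-target attack."""
    rng = random.Random(seed)
    p_fa = roc_false_alarm(p_tr, c)
    ta = fta = 0
    for _ in range(trials):
        target_pos = rng.random()
        target_declared = rng.random() < p_tr
        # Walk along the sweep; gaps between false targets are exponential.
        first_false_attack = math.inf
        pos = rng.expovariate(lam_ft)
        while pos < 1.0:
            if rng.random() < p_fa:
                first_false_attack = pos
                break
            pos += rng.expovariate(lam_ft)
        # The single warhead is spent on whichever declared object comes first.
        if target_declared and target_pos < first_false_attack:
            ta += 1
        elif first_false_attack < 1.0:
            fta += 1
    return ta / trials, fta / trials


if __name__ == "__main__":
    print(analytic_probs(0.8, 100.0, 10.0))
    print(simulate(0.8, 100.0, 10.0, trials=20000, seed=1))
```

With enough trials the simulated frequencies should settle near the closed-form values, which is the sense in which a simulation can corroborate an analytical performance characteristic.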

The other concurrent thesis accomplished during this time frame was
an experimental validation of the concepts related to this thesis [11]. An experimental
testbed was developed which entailed a remote-controlled car with a camera sensor. A
car was chosen to simplify the problem from three dimensions to two. The car represented
a UAV flying at a constant altitude. The camera was able to distinguish colors,
and a threshold on the number of pixels present in the field of view was established
as the analog of the sensor threshold in this thesis. Different types of targets (i.e., true vs.
false targets) were created from different sized shapes of the color to which the sensor
threshold was sensitive. The mock-munition was characterized by observing the
frequency of correct vs. incorrect classifications at different sensor threshold levels.
Essentially, each trial produced a separate confusion matrix. As mentioned in this
thesis, every search system has an associated c; the value of c for the experimental
car was identified by tuning the variable c in the ROC equation (equation 2.9) to
produce the best fit ROC for the experimental sensor setup. The testbed established
and described in [11] is a real-world reproduction of the theoretical results proposed
in this thesis and validated in simulation.
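
The tuning step described above amounts to a one-parameter curve fit. A minimal sketch follows, assuming the ROC has the one-parameter form $(1 - P_{FTR}) = P_{TR} / ((1 - c) P_{TR} + c)$; equation 2.9 is not reproduced in this chapter, so that form, the helper names, and the sample data (generated synthetically, not taken from [11]) are all assumptions.

```python
def roc_false_alarm(p_tr, c):
    """Assumed ROC form: false-target declaration probability for a given P_TR."""
    return p_tr / ((1.0 - c) * p_tr + c)


def sse(points, c):
    """Sum of squared errors between measured and modeled false-alarm rates."""
    return sum((roc_false_alarm(p_tr, c) - p_fa) ** 2 for p_tr, p_fa in points)


def fit_c(points, c_min=1.0, c_max=1000.0, grid=200, rounds=6):
    """Tune c by repeated grid refinement around the best candidate."""
    lo, hi = c_min, c_max
    best = lo
    for _ in range(rounds):
        step = (hi - lo) / grid
        candidates = [lo + i * step for i in range(grid + 1)]
        best = min(candidates, key=lambda c: sse(points, c))
        # Narrow the search window to the neighbors of the best grid point.
        lo = max(c_min, best - step)
        hi = min(c_max, best + step)
    return best


if __name__ == "__main__":
    # Hypothetical measurements: (P_TR, observed false-alarm rate), one pair
    # per threshold setting, here generated from the assumed ROC with c = 40.
    measured = [(p, roc_false_alarm(p, 40.0)) for p in (0.5, 0.6, 0.7, 0.8, 0.9)]
    print(fit_c(measured))
```

Each threshold setting contributes one point, taken from its confusion matrix, and the fitted c is the value whose ROC curve passes closest to those points in the least-squares sense. Other fitting criteria would serve equally well for a sketch of this kind.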

5.4  Recommendations  for  Future  Work 

There  are  ample  opportunities  for  future  research  extending  the  results  of  this 
thesis.  First,  the  optimal  results  from  this  theoretical  work  should  be  combined  with 
the  optimal  decision  rules  developed  in  Gillen’s  and  Dunkel’s  theses.  The  ultimate 
goal  is  to  produce  autonomous  munitions  that  operate  optimally  by  themselves  and 
as  part  of  a  cooperative  swarm. 

In addition, the experimental testbed begun by Capt Rufa should definitely be
continued.  There  is  a  great  deal  of  additional  research  that  can  be  accomplished  to 
the  end  of  reproducing  the  actions  of  an  autonomous  munition,  or  even  better  yet 
multiple,  cooperating,  autonomous  munitions.  The  theory  developed  in  this  thesis 
can  be  used  to  predict  the  performance  of  the  experimental  testbed.  Evaluating 
the  theoretical  prediction  and  the  experimental  result  in  concert  will  inevitably  shed 
substantial  light  on  the  utility  of  the  practical  application  of  this  theory. 

5.5  Conclusion

This thesis provides the justification for optimizing search and attack agent
performance in a stochastic battlespace with false targets, and develops and solves
the optimal control problem maximizing the performance of the agent. The
mathematical foundation for the optimal control problem is sound, and furthermore the
mathematical assumptions forming the foundation of this problem are not made in a
vacuum. Readily available battlespace intelligence and present and foreseeable search
and attack activities form the backbone and justification for the theory. With a
reasonable estimate as to the expected number of false targets in the battlespace area,
and a good characterization of the sensor/platform package, one may confidently
generate an expected probability that a given target will be attacked by an autonomous
munition as well as the probability that any undesired objects might be attacked.
The aggregation of the true target and corresponding false target probabilities forms
the weapon operating characteristic, the performance metric for a search and attack
munition in a stochastic battlespace with false targets.